Abstract. In [GW09a] we began an investigation of the following general question. Let
$L_1, \dots, L_m$ be a system of linear forms in $d$ variables on $\mathbb{F}_p^n$, and let $A$ be a subset of
$\mathbb{F}_p^n$ of positive density. Under what circumstances can one prove that $A$ contains roughly the
same number of $m$-tuples $L_1(x_1, \dots, x_d), \dots, L_m(x_1, \dots, x_d)$ with $x_1, \dots, x_d \in \mathbb{F}_p^n$ as a
typical random set of the same density? Experience with arithmetic progressions suggests
that an appropriate assumption is that $\|A - \delta 1\|_{U^k}$ should be small, where we have written
$A$ for the characteristic function of the set $A$, $\delta$ is the density of $A$, $k$ is some parameter
that depends on the linear forms $L_1, \dots, L_m$, and $\|\cdot\|_{U^k}$ is the $k$th uniformity norm. The
question we investigated was how $k$ depends on $L_1, \dots, L_m$. Our main result was that
there were systems of forms where $k$ could be taken to be 2 even though there was no
simple proof of this fact using the Cauchy-Schwarz inequality. Based on this result and
its proof, we conjectured that uniformity of degree $k - 1$ is a sufficient condition if and
only if the $k$th powers of the linear forms are linearly independent. In this paper we prove
this conjecture, provided only that $p$ is sufficiently large. (It is easy to see that some such
restriction is needed.) This result represents one of the first applications of the recent
inverse theorem for the $U^k$ norm over $\mathbb{F}_p^n$ by Bergelson, Tao and Ziegler [TZ08a, BTZ09].
We combine this result with some abstract arguments in order to prove that a bounded
function can be expressed as a sum of polynomial phases and a part that is small in the
appropriate uniformity norm. The precise form of this decomposition theorem is critical
to our proof, and the theorem itself may be of independent interest.

1. Introduction

In [GW09a] we investigated which systems of linear equations have the property that
any uniform subset of $\mathbb{F}_p^n$ contains the "expected" number of solutions. By the "expected"
number we mean the number of solutions one would expect in a random subset of the
same density, and by a "uniform subset of $\mathbb{F}_p^n$" we mean a set $A$ of density $\delta$ such that
$\|A - \delta 1\|_{U^2}$ is small, where $A$ is the characteristic function of $A$. More generally, we asked
the same question with the $U^2$ norm replaced by any other $U^k$ norm. Note that the $U^k$
norms increase as $k$ increases, so the condition that $\|A - \delta 1\|_{U^k}$ is small becomes stronger,
and there are more sets of linear forms for which it is sufficient.
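
To make the notion of uniformity concrete, the following minimal Python sketch evaluates $\|f\|_{U^2}^4 = \mathbb{E}_{x,a,b} f(x)\overline{f(x+a)}\,\overline{f(x+b)}f(x+a+b)$ by brute force on a very small group. The function name `u2_norm_fourth_power` and the two example functions are assumptions chosen purely for illustration; they do not come from [GW09a], and the brute-force loop is only feasible for tiny $p$ and $n$.

```python
import cmath
from itertools import product

def u2_norm_fourth_power(f, p, n):
    """Brute-force E_{x,a,b} f(x) conj(f(x+a)) conj(f(x+b)) f(x+a+b) on F_p^n.

    `f` maps a tuple in {0,...,p-1}^n to a real or complex number.
    (Illustrative helper; the name is hypothetical.)
    """
    points = list(product(range(p), repeat=n))

    def add(u, v):
        # Coordinate-wise addition in F_p^n.
        return tuple((s + t) % p for s, t in zip(u, v))

    total = 0j
    for x in points:
        for a in points:
            for b in points:
                total += (f(x)
                          * f(add(x, a)).conjugate()
                          * f(add(x, b)).conjugate()
                          * f(add(add(x, a), b)))
    # The average is a non-negative real number up to rounding error.
    return (total / len(points) ** 3).real

def linear_phase(x):
    # A linear phase on F_3^2: every term of the average equals 1.
    return cmath.exp(2j * cmath.pi * x[0] / 3)

def balanced_line(x):
    # Balanced function A - delta*1 of the set A = {x : x_1 = 0} in F_3^2.
    return (1.0 if x[0] == 0 else 0.0) - 1.0 / 3.0
```

For the linear phase the fourth power of the $U^2$ norm is 1, whereas for the balanced function of the line it is $2/81$ (two Fourier coefficients of size $1/3$), so this structured set is far from uniform even though it has density $1/3$.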

This question arises naturally in the context of Szemerédi's theorem. If $x_0, \dots, x_{k-1}$
satisfy the equations $x_i - 2x_{i+1} + x_{i+2} = 0$ for $i = 0, 1, 2, \dots, k - 3$, then they lie in an
arithmetic progression (in the sense that there exists $d \in \mathbb{F}_p^n$ such that $x_i = x_0 + id$ for
each $i$). It was shown in [G01] that if $\|A - \delta 1\|_{U^{k-1}}$ is small, then $A$ contains roughly the
number of arithmetic progressions of length $k$ that you would expect if the elements of $A$
had been selected randomly and independently with probability $\delta$. (More precisely, this
was shown in $\mathbb{Z}_N$ rather than $\mathbb{F}_p^n$, but the proof carries over very easily.) The proof used
multiple applications of the Cauchy-Schwarz inequality. Moreover, this result is sharp,
in the sense that $\|A - \delta 1\|_{U^{k-2}}$ can be small without $A$ containing roughly the expected
number of progressions of length $k$.

In their investigations of solutions of linear equations in the primes, Green and Tao
[GrT06] worked out the most general result that could be proved using this kind of
approach. Note first that by parametrizing the set of solutions to a system of linear equations
one can talk equivalently about systems of linear forms. For instance, instead of the
equations $x_i - 2x_{i+1} + x_{i+2} = 0$ for $i = 0, 1, 2, \dots, k - 3$ mentioned above one can look at the
system of linear forms $x, x + y, x + 2y, \dots, x + (k - 1)y$. Green and Tao defined a notion
of "complexity" for a system of linear forms in $d$ variables $x_1, \dots, x_d$, and proved that for
a system $L_1, \dots, L_m$ of complexity $k$ you will get roughly the expected number of images
$L_1(x_1, \dots, x_d), \dots, L_m(x_1, \dots, x_d)$ in $A$ provided that $\|A - \delta 1\|_{U^{k+1}}$ is small. However,
if one also works out the most general result that can be obtained by straightforwardly
adapting the examples that prove that the $U^{k-1}$ norm is needed for progressions of length
$k$, then a discrepancy emerges. It is easy to show that if the functions $L_1^k, \dots, L_m^k$ are
linearly dependent, then there exists $A$ such that $\|A - \delta 1\|_{U^k}$ is small but $A$ does not
have roughly the expected number of solutions. However, there are systems of linear forms
that have complexity $k$ while the functions $L_1^k, \dots, L_m^k$ are linearly independent, and the
easy arguments do not tell us how they behave. The main result of [GW09a] was that
for at least some such systems it is enough for $\|A - \delta 1\|_{U^k}$ to be small. More specifically,
we showed that there are systems of equations of complexity 2 such that it is enough to
assume that $\|A - \delta 1\|_{U^2}$ is small, whereas a direct application of the argument of Green
and Tao would require $\|A - \delta 1\|_{U^3}$ to be small.

To state our result in a concise way, we defined the true complexity of a system of linear
equations in $d$ variables to be the smallest $k$ with the following property. For every $\eta > 0$
there exists $\epsilon > 0$ such that for every $\delta \in [0, 1]$ and every subset $A \subset \mathbb{F}_p^n$ of density $\delta$, if
$\|A - \delta 1\|_{U^{k+1}} < \epsilon$ then $p^{-nd}$ times the number of $m$-tuples $L_1(x_1, \dots, x_d), \dots, L_m(x_1, \dots, x_d)$
in $A$ lies within $\eta$ of what one would expect in the random case (assuming that there are
no degeneracies). To distinguish our notion of complexity from that of Green and Tao, we
referred to theirs as Cauchy-Schwarz complexity.

Theorem 1.1. [GW09a] Let $L_1, \dots, L_m$ be a system of linear forms in $d$ variables of
Cauchy-Schwarz complexity at most 2. Suppose that the functions $L_1^2, \dots, L_m^2$ are linearly
independent. Then the linear system $L_1, \dots, L_m$ has true complexity 1.

In the light of this result, we made the following natural conjecture. 

Conjecture 1.2. [GW09a] The true complexity of a linear system $L_1, \dots, L_m$ is the least
integer $k$ such that the forms $L_1^{k+1}, L_2^{k+1}, \dots, L_m^{k+1}$ are linearly independent.

A statement in ergodic theory analogous to Conjecture 1.2 was proved by Leibman
[Lei07] independently of our work in [GW09a] and at about the same time. However,
there does not appear to be a correspondence principle that would enable one to deduce
Conjecture 1.2 itself from his results.

Let us be slightly more precise about what it means for the forms $L_i^{k+1}$ to be linearly
independent. A linear form $L$ on $\mathbb{F}_p^n$ in $d$ variables is a function of the form $L(x_1, \dots, x_d) =
c_1x_1 + \dots + c_dx_d$. Here, the variables $x_i$ are elements of $\mathbb{F}_p^n$ and the coefficients $c_i$ belong to $\mathbb{F}_p$.
Clearly, a linear form just depends on its coefficients $(c_1, \dots, c_d)$, so we can view a system
$L_1, \dots, L_m$ of linear forms on $\mathbb{F}_p^n$ as a system of linear forms on $\mathbb{F}_p$, in which case they take
values in $\mathbb{F}_p$. We say that a system of linear forms $L_1, \dots, L_m$ is degree-$k$ independent if
the functions $L_1^k, \dots, L_m^k$ are linearly independent when $L_1, \dots, L_m$ are viewed as functions
from $(\mathbb{F}_p)^d$ to $\mathbb{F}_p$. When $k = 2$ we shall also call them square independent and when $k = 3$
we shall call them cube independent.
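
The definition of degree-$k$ independence can be tested directly on small examples. The sketch below is an illustration only: it tabulates each function $L_i^k$ on all of $(\mathbb{F}_p)^d$ and computes the rank of the resulting matrix over $\mathbb{F}_p$ by Gaussian elimination. The helper names `rank_mod_p` and `is_degree_k_independent`, and the encoding of a form by its coefficient tuple $(c_1, \dots, c_d)$, are assumptions made for this example.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank over F_p of a matrix given as a list of rows (Gaussian elimination)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p
                           for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

def is_degree_k_independent(forms, k, p):
    """True if the functions L_i^k : (F_p)^d -> F_p are linearly independent.

    Each form is a coefficient tuple (c_1, ..., c_d); this encoding is an
    assumption made for the illustration.
    """
    d = len(forms[0])
    points = list(product(range(p), repeat=d))
    table = [[pow(sum(c * x for c, x in zip(form, pt)) % p, k, p)
              for pt in points]
             for form in forms]
    return rank_mod_p(table, p) == len(forms)

# The forms x, x+y, x+2y, x+3y (progressions of length 4) over F_5.
AP4 = [(1, 0), (1, 1), (1, 2), (1, 3)]
# The forms x, x+y, x+2y (progressions of length 3) over F_5.
AP3 = [(1, 0), (1, 1), (1, 2)]
```

With $p = 5$, the squares of the four forms in `AP4` are linearly dependent (they lie in the three-dimensional space of quadratic forms in two variables), while their cubes are independent, which is consistent with the $U^3$ norm being the right one for progressions of length 4.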

The present paper is the second of three papers that elaborate in different ways on the
main result of [GW09a]. The first one [GW09b] obtains significantly improved bounds for
Theorem 1.1 above, while the third [GW09c] adapts the methods used in the context of
$\mathbb{F}_p^n$ to the technically more challenging setting of $\mathbb{Z}_N$, also obtaining respectable bounds.
The purpose of this paper is to prove Conjecture 1.2 in $\mathbb{F}_p^n$, at least when $p$ is sufficiently
large. (The precise condition we need is that $p$ should be larger than the Cauchy-Schwarz
complexity of the system of linear forms. The reader may have noticed that we have
not defined Cauchy-Schwarz complexity. That is because in this paper we do not use the
notion in a detailed way: all we do is quote a lemma that uses a bound on Cauchy-Schwarz
complexity as a hypothesis. The definition can be found in [GW09b].)

Let us briefly recall the structure of the proof of Theorem 1.1 in [GW09a]. First of
all, it is not hard to prove the following equivalent condition for a system to have true
complexity $k$.



Lemma 1.3. A system of linear forms $L_1, \dots, L_m$ in $d$ variables $x_1, \dots, x_d$ has true
complexity $k$ if and only if the following statement holds. For every $\epsilon > 0$ there exists $c > 0$
such that if $f : \mathbb{F}_p^n \to \mathbb{C}$ is any function with $\|f\|_\infty \le 1$ and $\|f\|_{U^{k+1}} \le c$, and if $E$ is any
non-empty subset of $\{1, 2, \dots, m\}$, then

$$\Big| \mathbb{E}_{x_1, \dots, x_d} \prod_{i \in E} f(L_i(x_1, \dots, x_d)) \Big| \le \epsilon.$$



In order to prove this for square-independent systems of Cauchy-Schwarz complexity 2,
we decomposed the bounded function $f$ into three bounded parts $f_1 + f_2 + f_3$. The first
part was "quadratically structured," in a certain sense that allowed us to carry out explicit
calculations in order to estimate the quantity $\mathbb{E}_{x_1, \dots, x_d} \prod_{i=1}^m f_1(L_i(x_1, \dots, x_d))$. The second
was "quadratically uniform," which means simply that $\|f_2\|_{U^3}$ is small. The third was
small in $L_2$. To do this, we quoted a structure theorem of Green and Tao [Gr06], which is
a consequence of the inverse theorem for the $U^3$ norm [GrT08a].

When evaluating the average

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)),$$

we obtained a sum of $3^m$ terms. Because $f_1$, $f_2$ and $f_3$ were bounded, any term involving $f_3$
was small by the Cauchy-Schwarz inequality. The results of Green and Tao about
Cauchy-Schwarz complexity guaranteed that any term involving $f_2$ was small as well. We were
therefore left needing to estimate $\mathbb{E}_{x_1, \dots, x_d} \prod_{i=1}^m f_1(L_i(x_1, \dots, x_d))$, which, as we mentioned
above, could be done by means of an explicit calculation. First, we observed that if $f$ is
linearly uniform, meaning that $\|f\|_{U^2}$ is small, then so is the function $f_1$ that comes out of
the structure theorem of Green and Tao. We then did the calculation and discovered that
if the functions $L_i^2$ were linearly independent, then this term too was small.

We gave an outline in the remarks of that paper of how we thought a proof of Conjecture
1.2 might proceed. Given a linear system of Cauchy-Schwarz complexity $k$, we would
need to be able to write a bounded function $f$ as a sum $g + h$, where $g$ has "polynomial
structure" of degree $k$ and $h$ is uniform of degree $k$. However, such a decomposition theorem
necessarily requires an inverse theorem for the $U^{k+1}$ norm over $\mathbb{F}_p^n$, which for $k > 2$ had
not been proved at the time that [GW09a] was written.

Since then, an inverse theorem for the $U^k$ norm for functions defined on $\mathbb{F}_p^n$, which we shall
state formally in the next section, has been proved by Bergelson, Tao and Ziegler [BTZ09,
TZ08a]. Because of this, it has become feasible to prove Conjecture 1.2. However, proving
the conjecture is not simply a matter of using this new theorem and straightforwardly
generalizing our other arguments. Instead, we have to do some work to formulate and
develop a usable decomposition theorem. To do this, we follow a different method from
the one in [GW09a], which we introduced in [GW09b]. The decomposition theorem of
Green and Tao is inspired by arguments in ergodic theory and proved using averaging
projections and energy-increment arguments. But for technical reasons it is not obvious
how to generalize that approach to the cubic and higher-order cases. (It is not hard to
obtain decompositions, but to be useful a decomposition has to have further properties: it
is here that the difficulty lies.) In [GW09b] we used the Hahn-Banach theorem to obtain
decomposition results that are more in the spirit of Fourier analysis, and that is what we
shall do here. Again, the generalization is not straightforward. Perhaps the main difficulty
is that the notion of the rank of a bilinear form, which is crucial to our earlier arguments,
does not have an obvious analogue for multilinear forms.

In a later section we briefly outline the strategy for systems $L_1, \dots, L_m$ of Cauchy-Schwarz
complexity 3. By the results of Green and Tao, such systems have true complexity at
most 3. This gives us two separate cases to consider when we are trying to prove that the
expression $\mathbb{E}_{x_1, \dots, x_d} \prod_{i=1}^m f(L_i(x_1, \dots, x_d))$ is small.

In the first case, we may assume that the linear system $L_1, \dots, L_m$ is cube independent
and that $f$ is highly quadratically uniform. Here we decompose $f$ as a sum $f_1 + f_2 + f_3$
such that $f_1$ has cubic structure, $f_2$ is small in $U^4$, and $f_3$ is small in an $L_p$-sense that
we shall not specify exactly here. Because the system has Cauchy-Schwarz complexity 3,
any average involving $f_2$ is negligible, boundedness allows us to deal with terms involving
$f_3$, and an explicit computation of the average over the structured part uses the cube
independence and quadratic uniformity of $f$. This is a straightforward generalization of
the argument in [GW09b].

The second case is more complicated and already encapsulates all the difficulties that
arise in the general case. Here we may assume that the system $L_1, \dots, L_m$ is square
independent and that $f$ is highly linearly uniform. The difference between this case and
Theorem 1.1 is that now we have the weaker hypothesis that the Cauchy-Schwarz
complexity is at most 3. Briefly, this forces us to consider not just quadratically structured
terms but cubically structured terms as well. We start off by decomposing $f$ as a sum
$f_1 + f_2 + f_3$ such that $f_1$ has quadratic structure, $f_2$ is small in $U^3$, and $f_3$ is small in
$L_1$, but this time we have to decompose the quadratically uniform part $f_2$ further into a
sum $f_2' + g_2 + h_2$, where $f_2'$ has cubic structure, $g_2$ is small in $U^4$ and $h_2$ is small in an $L_1$
sense. As before, any average involving $g_2$ as a factor is easily shown to be negligible by
Cauchy-Schwarz. The computation involving the structured parts can be performed
without too much difficulty, with the help of the fact that a system that is square independent
is necessarily cube independent.

In order to get this approach to work, it is very important that the parameters involved
in the error estimates should depend on each other in the right way. At various points we
require the polynomial phases to have high rank (once we have decided what that means),
and the uniform part to be arbitrarily small as a function of a certain type of "complexity"
of the structured part. Finally, we need the uniform part to remain bounded while
satisfying the preceding requirements.

We expect the resulting decomposition theorems to be of independent interest. Since 
they have the flavour of arithmetic-regularity-type decompositions they necessarily result in 
tower-type bounds, even in simple cases. However, since the bound in the inverse theorem 
that we use in the proof is not explicit (and if made explicit in the current state, would 
certainly be far worse than tower type), there is not much reason to struggle to obtain 
better bounds. If, however, better bounds are discovered for the inverse theorem, it might
be worth revisiting the arguments of this paper to try to obtain bounds more like those in
[GW09b].

Recently, Green, Tao and Ziegler proved a long-awaited inverse theorem for the $U^k$ norm
for functions defined on $\mathbb{Z}_N$ (the case $k = 4$ appears in [GrTZ09]), thereby raising the
possibility of proving Conjecture 1.2 in full for $\mathbb{Z}_N$. As we were on the point of submitting
this paper, Green and Tao did indeed do this [GrT10], which means that, at least from a
qualitative point of view, the programme of which this paper forms a part is now complete.

2. Inverse and decomposition theorems 

As we outlined in the introduction, even the simplest possible generalization of Theorem
1.1 to the case where the system of linear forms is cube independent and has
Cauchy-Schwarz complexity at most 3 requires an inverse theorem for the $U^4$ norm.

An inverse theorem for the $U^3$ norm was proved by Green and Tao [GrT08a] for $p > 2$
and by Samorodnitsky [S07] for $p = 2$. We write $\omega$ for $\exp(2\pi i/p)$.

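Theorem 2.1. Let $0 < \delta \le 1$ and let $p$ be a prime. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function with
$\|f\|_\infty \le 1$ and $\|f\|_{U^3} \ge \delta$. Then there exists a quadratic polynomial $q : \mathbb{F}_p^n \to \mathbb{F}_p$ and a
constant $\gamma(\delta)$ such that

$$|\mathbb{E}_{x \in \mathbb{F}_p^n} f(x) \omega^{q(x)}| \ge \gamma(\delta).$$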

It was conjectured that this result should hold for higher $U^k$ norms; in particular, a
function that is large in the $U^{k+1}$ norm ought to correlate with a polynomial phase function of
degree $k$. It was recently shown independently in [GrT07] and [LMS08] that the conjecture
is false in this generality. In particular, explicit counterexamples were given in the case
$p = 2$, and more generally when the degree $d$ of the polynomials involved exceeds the
characteristic $p$ of the underlying field. However, even after these examples it was reasonable
to believe that the conjecture was true whenever the characteristic $p$ was sufficiently large,
and this was eventually proved by Bergelson, Tao and Ziegler [BTZ09, TZ08a].

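Theorem 2.2. Let $0 < \delta \le 1$ and let $p$ be a prime. Let $f : \mathbb{F}_p^n \to \mathbb{C}$ be a function with
$\|f\|_\infty \le 1$ and $\|f\|_{U^{d+1}} \ge \delta$. Then there exists a polynomial $\pi : \mathbb{F}_p^n \to \mathbb{F}_p$ of degree $d$ and a
constant $\gamma(\delta)$ such that

$$|\mathbb{E}_{x \in \mathbb{F}_p^n} f(x) \omega^{\pi(x)}| \ge \gamma(\delta),$$

provided that $p > d$.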

In the case of low characteristic it was observed by Bergelson, Tao and Ziegler that the
customary notion of a polynomial phase function, defined to be $\exp(2\pi i \pi(x)/p)$ for some
polynomial $\pi$, was not appropriate. The problem is that such functions are not the most
general multiplicative Freiman homomorphisms that one can define on $\mathbb{F}_p^n$: it turns out
that there are other ones that involve $p^k$th roots of unity. By adopting a more general
and more natural definition of a polynomial phase function, Bergelson, Tao and Ziegler
were able to prove that even when $p \le d$, a function $f$ that exhibits large $U^{d+1}$ norm
correlates with a (multiplicative) polynomial phase of degree $c(d)$, where $c$ is a function of
$d$. However, they did not show that $c(d)$ could be taken to equal $d$, so the modified inverse
conjecture is not quite completely proved for low characteristic.

In the light of this, we shall assume in the remainder of this paper that p is sufficiently 
large. In particular, we shall assume that p exceeds the Cauchy-Schwarz complexity of the 
linear system that is being investigated. 

Now let us turn to a general discussion of how to use inverse theorems to prove that a
bounded function can be decomposed into a structured part, a uniform part and a small
part. We shall use the following abstract result, which is Theorem 5.7 of [G08]. It is a
general "arithmetic regularity lemma" of a kind that was introduced by Green [Gr05], and
is also closely related to Theorem 3.5 of [T06]. However, the proof given in [G08] is quite
different: like the argument used to prove the decomposition theorem in [GW09b] it is
based on the Hahn-Banach theorem.

Theorem 2.3. Let $\|\cdot\|$ be a norm on $\mathbb{R}^n$ and let $\Phi \subset \mathbb{R}^n$ be a set of functions satisfying
the following properties for some strictly increasing function $c : (0, 1] \to (0, 1]$:

• $\Phi$ contains the constant function 1, $\Phi = -\Phi$, $\|\phi\|_\infty \le 1$ for every $\phi \in \Phi$, and the
linear span of $\Phi$ is $\mathbb{R}^n$;

• $\langle f, \phi \rangle \le 1$ for every $f$ with $\|f\| \le 1$ and every $\phi \in \Phi$;

• if $\|f\|_\infty \le 1$ and $\|f\| \ge \epsilon$, then there exists $\phi \in \Phi$ such that $\langle f, \phi \rangle \ge c(\epsilon)$.

Let $\epsilon > 0$ and let $\eta : \mathbb{R}_+ \to \mathbb{R}_+$ be a strictly decreasing function. Then there is a constant
$M_0$, depending only on $\epsilon$ and the functions $c$ and $\eta$, such that every function $f \in \mathbb{R}^n$ that
takes values in $[0, 1]$ can be decomposed as a sum $f_1 + f_2 + f_3$, with the following properties:

• the functions $f_1$ and $f_1 + f_3$ take values in $[0, 1]$;

• $f_1$ is of the form $\sum_i \lambda_i \psi_i$, where $\sum_i |\lambda_i| = M \le M_0$ and each $\psi_i$ is a product of
functions in $\Phi$;

• $\|f_2\| \le \eta(M)$;

• $\|f_3\|_2 \le \epsilon$.

Let us make a few remarks about this theorem. Roughly speaking, we shall take $\Phi$ to be
the set of polynomial phase functions of a certain degree (but there is a small technicality
in that these take complex rather than real values) and $\|\cdot\|$ will be the $U^{k+1}$ norm. The first
two properties will then be easy to check, and the third is the inverse theorem. Theorem
2.3 will then tell us that we can decompose an arbitrary function into a sum of
degree-$k$ polynomial phase functions, a function with very small $U^{k+1}$ norm and a function with
small $L_2$ norm. Another small technicality is that we shall be interested in functions that
take values in an interval $[-C, C]$, but this again is easily dealt with.

Before we can proceed we shall need to know a little more about polynomial phase
functions. Given a polynomial $\pi$ of degree $d$ we define the associated $d$-linear form $\kappa$ by
the formula

$$\kappa(h_1, \dots, h_d) = \sum_{\epsilon \in \{0,1\}^d} (-1)^{d - |\epsilon|} \pi(\epsilon.h),$$

where $\epsilon.h$ is shorthand for $\sum_i \epsilon_i h_i$ and $|\epsilon|$ is shorthand for $\epsilon_1 + \dots + \epsilon_d$. Let us note some
simple facts about $\kappa$. (These are well known but we include proofs for the convenience of
the reader.)

Lemma 2.4. The function $\kappa$ just defined is a symmetric $d$-linear form on $\mathbb{F}_p^n$. Moreover,
for every $x \in \mathbb{F}_p^n$ we have the equality

$$\kappa(h_1, \dots, h_d) = \sum_{\epsilon \in \{0,1\}^d} (-1)^{d - |\epsilon|} \pi(x + \epsilon.h).$$

Proof. We prove both results by induction on $d$. The base cases both follow from the fact
that if $\pi$ is a polynomial of degree 1, then $\pi(x + a) - \pi(x)$ is a homogeneous linear function
of $a$ that does not depend on $x$.

To prove the first statement, note first that it is clear from the definition that the value
of $\kappa(h_1, \dots, h_d)$ is unaffected if one permutes the variables $h_1, \dots, h_d$. Next, let us reexpress
$\kappa(h_1, \dots, h_d)$ as

$$\sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{d-1-|\epsilon|} \big(\pi(\epsilon_1 h_1 + \dots + \epsilon_{d-1} h_{d-1} + h_d) - \pi(\epsilon_1 h_1 + \dots + \epsilon_{d-1} h_{d-1})\big).$$

For each fixed $h_d$ the function $x \mapsto \pi(x + h_d) - \pi(x)$ is a polynomial of degree $d - 1$ in
$x$. Therefore, by the inductive hypothesis, for each fixed $h_d$ the form $\kappa(h_1, \dots, h_d)$ is
$(d - 1)$-linear (and symmetric) in $h_1, \dots, h_{d-1}$. By symmetry, the dependence on $h_d$ is linear
for fixed $h_1, \dots, h_{d-1}$.

The second part is proved in a very similar way. Let us define $\kappa_x(h_1, \dots, h_d)$ to be
$\sum_{\epsilon \in \{0,1\}^d} (-1)^{d-|\epsilon|} \pi(x + \epsilon.h)$. Then we can reexpress $\kappa_x(h_1, \dots, h_d)$ as

$$\sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{d-1-|\epsilon|} \big(\pi(x + \epsilon_1 h_1 + \dots + \epsilon_{d-1} h_{d-1} + h_d) - \pi(x + \epsilon_1 h_1 + \dots + \epsilon_{d-1} h_{d-1})\big).$$

Again, for each fixed $h_d$ the function $x \mapsto \pi(x + h_d) - \pi(x)$ is a polynomial of degree $d - 1$ in
$x$. Therefore, by induction, for each fixed $h_d$ we have the desired equality $\kappa_x(h_1, \dots, h_d) =
\kappa(h_1, \dots, h_d)$, which of course implies that the equality always holds. □
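
The two assertions of Lemma 2.4 are easy to sanity-check numerically. The sketch below does so for one hypothetical cubic polynomial on $\mathbb{F}_5^2$: it evaluates the alternating sum $\sum_{\epsilon} (-1)^{d-|\epsilon|} \pi(x + \epsilon.h)$ for every base point $x$ and checks that the value does not depend on $x$, and that it is additive in the first argument. The polynomial `pi_poly`, the prime `P` and the function names are assumptions chosen for illustration only.

```python
from itertools import product

P = 5  # hypothetical small prime for the illustration

def pi_poly(x):
    """A sample (non-homogeneous) polynomial of degree 3 on F_5^2."""
    x1, x2 = x
    return (x1 * x1 * x2 + x1 * x2 + 3 * x1 + 2) % P

def kappa_at(x, hs):
    """Sum over eps in {0,1}^d of (-1)^(d-|eps|) * pi(x + eps.h), mod P."""
    d = len(hs)
    total = 0
    for eps in product((0, 1), repeat=d):
        point = tuple((xi + sum(e * h[i] for e, h in zip(eps, hs))) % P
                      for i, xi in enumerate(x))
        total += (-1) ** (d - sum(eps)) * pi_poly(point)
    return total % P

def is_independent_of_x(hs):
    """Check that the alternating sum takes the same value at every x."""
    values = {kappa_at(x, hs) for x in product(range(P), repeat=2)}
    return len(values) == 1

def is_additive_in_first(h1, h1_prime, h2, h3):
    """Check kappa(h1 + h1', h2, h3) = kappa(h1, h2, h3) + kappa(h1', h2, h3)."""
    zero = (0, 0)
    h_sum = tuple((a + b) % P for a, b in zip(h1, h1_prime))
    lhs = kappa_at(zero, (h_sum, h2, h3))
    rhs = (kappa_at(zero, (h1, h2, h3)) + kappa_at(zero, (h1_prime, h2, h3))) % P
    return lhs == rhs
```

Only the degree-3 part $x_1^2 x_2$ of the sample polynomial contributes to $\kappa$; the lower-order terms are annihilated by the third-order alternating sum.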

We are now in a position to evaluate the $U^{k+1}$ norm of a degree-$k$ polynomial phase as well as
its $U^{k+1}$ dual norm.

Lemma 2.5. Let $\pi : \mathbb{F}_p^n \to \mathbb{F}_p$ be a polynomial of degree $k$ and let $g$ be the polynomial
phase function $\omega^{\pi}$. Then $\|g\|_{U^{k+1}} = \|g\|_{U^{k+1}}^* = 1$.

Proof. Lemma 2.4 implies, amongst other things, that the identity $\sum_{\epsilon \in \{0,1\}^{k+1}} (-1)^{|\epsilon|} \pi(x +
\epsilon.h) = 0$ holds for any $x \in \mathbb{F}_p^n$, any $h \in (\mathbb{F}_p^n)^{k+1}$ and any polynomial $\pi$ of degree at most $k$.
It follows immediately that $\|g\|_{U^{k+1}} = 1$. Next we turn our attention to the $U^{k+1}$ dual norm.

The generalized Cauchy-Schwarz inequality for the uniformity norms states that if for
each $\epsilon \in \{0, 1\}^{k+1}$ we have a function $f_\epsilon : \mathbb{F}_p^n \to \mathbb{C}$, then

$$\Big| \mathbb{E}_{x, h_1, \dots, h_{k+1}} \prod_{\epsilon \in \{0,1\}^{k+1}} C^{|\epsilon|} f_\epsilon(x + \epsilon.h) \Big| \le \prod_{\epsilon \in \{0,1\}^{k+1}} \|f_\epsilon\|_{U^{k+1}}.$$

Here $C^{|\epsilon|}$ is the operation of taking the complex conjugate $|\epsilon|$ times. A proof of this
inequality (for $\mathbb{Z}_N$ rather than $\mathbb{F}_p^n$ but the argument is identical) can be found in [G01].

We apply this result with $f_0 = f$ and $f_\epsilon = g = \omega^{\pi}$ for every other $\epsilon \in \{0, 1\}^{k+1}$. The
identity $\sum_{\epsilon \in \{0,1\}^{k+1}} (-1)^{|\epsilon|} \pi(x + \epsilon.h) = 0$ implies that for this choice of functions we have

$$\mathbb{E}_{x, h_1, \dots, h_{k+1}} \prod_{\epsilon \in \{0,1\}^{k+1}} C^{|\epsilon|} f_\epsilon(x + \epsilon.h) = \mathbb{E}_x f(x) \omega^{-\pi(x)} = \langle f, g \rangle,$$

since the product $\prod_{\epsilon \ne 0} C^{|\epsilon|} \omega^{\pi(x + \epsilon.h)}$ equals $\omega^{-\pi(x)}$. Also, $\|g\|_{U^{k+1}} \le \|g\|_\infty = 1$. Therefore,
we find that $|\langle f, g \rangle| \le \|f\|_{U^{k+1}} \|g\|_{U^{k+1}}^{2^{k+1}-1} = \|f\|_{U^{k+1}}$. Since $f$ was arbitrary, it follows that
$\|g\|_{U^{k+1}}^* \le 1$, as claimed. If we take $f = g$ then the same identity implies that $\langle f, g \rangle = 1$,
which implies that $\|g\|_{U^{k+1}}^* = 1$. (It is of course just the fact that $\|g\|_{U^{k+1}}^* \le 1$ that we
shall actually use.) □

Now we are ready to state and prove a deduction from Theorem 2.3 that will be an
important tool for us later.

Corollary 2.6. Let $\epsilon > 0$ and let $\eta : \mathbb{R}_+ \to \mathbb{R}_+$ be a strictly decreasing function. Then there
is a constant $M_0 = M_0(\epsilon, \eta)$ such that every function $f : \mathbb{F}_p^n \to [-1, 1]$ can be decomposed
as a sum $f_1 + f_2 + f_3$ with the following properties.

• $f_1$ and $f_1 + f_3$ take values in $[-1, 1]$.

• $f_1(x)$ is given by a sum of the form $\sum_i \lambda_i \omega^{\pi_i(x)}$, where each $\pi_i$ is a polynomial on
$\mathbb{F}_p^n$ of degree at most $k$ and $\sum_i |\lambda_i| = M \le M_0$.

• $\|f_2\|_{U^{k+1}} \le \eta(M)$.

• $\|f_3\|_2 \le \epsilon$.

Proof. Let $\Phi$ be the set of all functions $\pm(\omega^{\pi(x)} + \omega^{-\pi(x)})/2$ such that $\pi : \mathbb{F}_p^n \to \mathbb{F}_p$ is
a polynomial of degree at most $k$. Then $1 \in \Phi$, $\Phi = -\Phi$, $\|\phi\|_\infty \le 1$ for every $\phi \in \Phi$,
and the linear span of $\Phi$ is $\mathbb{R}^{\mathbb{F}_p^n}$. (The last of these statements follows from the fact that
$\Phi$ contains all the characters on $\mathbb{F}_p^n$.) Furthermore, every product of functions in $\Phi$ is a
convex combination of phase functions $\omega^{\pi}$ of degree at most $k$.

Let $f$ be an arbitrary function from $\mathbb{F}_p^n$ to $\mathbb{R}$. Lemma 2.5 implies that if $\phi = \pm(\omega^{\pi} +
\omega^{-\pi})/2 \in \Phi$, then $\|\phi\|_{U^{k+1}}^* \le (\|\omega^{\pi}\|_{U^{k+1}}^* + \|\omega^{-\pi}\|_{U^{k+1}}^*)/2 = 1$, from which it follows that
$\langle f, \phi \rangle \le \|f\|_{U^{k+1}}$, so the second assumption of Theorem 2.3 holds with $\|\cdot\| = \|\cdot\|_{U^{k+1}}$.

The inverse theorem, Theorem 2.2, tells us that if $\|f\|_\infty \le 1$ and $\|f\|_{U^{k+1}} \ge \epsilon$ then there
is a polynomial phase function $\omega^{\pi}$ of degree at most $k$ such that $|\langle f, \omega^{\pi} \rangle| \ge c(\epsilon)$. If $f$ is real,
then $\langle f, \omega^{\pi} \rangle = \langle f, \omega^{-\pi} \rangle$, so setting $\phi = (\omega^{\pi} + \omega^{-\pi})/2$, we have $\phi \in \Phi$ and $|\langle f, \phi \rangle| \ge c(\epsilon)$.
By changing sign if necessary, we can then find $\phi \in \Phi$ such that $\langle f, \phi \rangle \ge c(\epsilon)$, which proves
the third assumption.

Since the hypotheses of Theorem 2.3 hold, we may deduce that if $f$ takes values in the
interval $[0, 1]$, then it can be decomposed as a sum $f_1 + f_2 + f_3$ with the properties given
to us by that theorem. Since each $\phi \in \Phi$ is an average of two polynomial phase functions
of degree at most $k$, this is exactly what we want apart from the fact that we are trying
to prove a theorem about functions that take values in $[-1, 1]$ rather than $[0, 1]$. But to
remedy this all we have to do is start with a function $f$ that takes values in $[-1, 1]$ and
apply the above argument to $(1 + f)/2$. Once we have expressed that as $f_1 + f_2 + f_3$, we
know that $f = (2f_1 - 1) + 2f_2 + 2f_3$, which is of the required form (with different constants,
but that can of course be dealt with by replacing $\epsilon$ by $\epsilon/2$, $\eta(M)$ by $\eta(2M + 1)/2$ and the
output $M$ by $2M + 1$ in Theorem 2.3). □

3. Basic properties of multilinear forms

In order to be able to make use of our decomposition theorem in the preceding section, 
we need to establish some basic properties of polynomial phase functions. In particular, 
we must develop a useful definition of the rank of a polynomial phase function. 

For a quadratic phase function $\omega^{q(x)}$ there is a standard way of proceeding: one defines
a bilinear form $\beta(x, y) = q(x + y) - q(x) - q(y) + q(0)$ on $\mathbb{F}_p^n$, and then one takes the
rank of that bilinear form. (We put in $q(0)$ so that we do not have to assume that $q$ is
homogeneous.)

As we commented earlier, this is less straightforward for higher-degree polynomials, for 
the simple reason that there is no single obviously best definition of the rank of a multilinear 
form. Several definitions have been considered in the literature, and they have different 
advantages and disadvantages. In this paper, we sidestep the problem as follows. In the 
quadratic case, we made use of the following lemma, which we briefly state and prove to 
help with the discussion. 

Lemma 3.1. Let $q$ be a quadratic form on $\mathbb{F}_p^n$ of rank $r$. Then $|\mathbb{E}_x \omega^{q(x)}| \le p^{-r/2}$.

Proof. We use a simple and standard technique for estimating Gauss sums.

$$|\mathbb{E}_x \omega^{q(x)}|^2 = \mathbb{E}_{x,y} \omega^{q(x) - q(y)} = \mathbb{E}_{x,u} \omega^{q(x) - q(x+u)} = \mathbb{E}_{x,u} \omega^{-\beta(x,u) - q(u) + q(0)},$$

where $\beta$ is the bilinear form associated with $q$. For any fixed $u$, the expectation $\mathbb{E}_x \omega^{-\beta(x,u)}$
is 0 unless $\beta(x, u) = 0$ for every $x$, in which case it is 1. But the space of $u$ such that
$\beta(x, u) = 0$ for every $x$ has codimension equal to the rank of $\beta$, so the density of this space
is $p^{-r}$. It follows that $|\mathbb{E}_x \omega^{q(x)}|^2 \le p^{-r}$, which proves the result. □
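
The bound in Lemma 3.1 can be checked by brute force in a small case. The sketch below is illustrative only: it computes $|\mathbb{E}_x \omega^{q(x)}|$ together with the density of the kernel $\{u : \beta(x, u) = 0 \text{ for every } x\}$, which equals $p^{-r}$. The function names and the sample form $q(x) = x_1^2 + x_2^2$ on $\mathbb{F}_5^3$ (of rank 2) are assumptions made for this example.

```python
import cmath
from itertools import product

def phase_average(q, p, n):
    """E_x omega^{q(x)} over x in F_p^n, with omega = exp(2*pi*i/p)."""
    omega = cmath.exp(2j * cmath.pi / p)
    points = list(product(range(p), repeat=n))
    return sum(omega ** (q(x) % p) for x in points) / len(points)

def kernel_density(q, p, n):
    """Density of {u : beta(x,u) = 0 for all x}, where
    beta(x,y) = q(x+y) - q(x) - q(y) + q(0). This density equals p^{-rank}."""
    points = list(product(range(p), repeat=n))
    zero = (0,) * n

    def beta(x, u):
        s = tuple((a + b) % p for a, b in zip(x, u))
        return (q(s) - q(x) - q(u) + q(zero)) % p

    kernel = [u for u in points if all(beta(x, u) == 0 for x in points)]
    return len(kernel) / len(points)

def sample_q(x):
    # Hypothetical example: q(x) = x_1^2 + x_2^2 on F_5^3 (the third
    # coordinate does not appear, so the rank is 2).
    return x[0] * x[0] + x[1] * x[1]
```

In this example the kernel consists of the vectors with $u_1 = u_2 = 0$, so its density is $5^{-2}$, and the phase average has modulus exactly $5^{-1} = p^{-r/2}$, so the bound of the lemma is attained.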

In the absence of a clearly analogous definition of the rank of a multilinear form, we
simply define it in such a way as to make the obvious generalization of the above proof
work. (We adopt a similar strategy in [GW09c] in order to deal with generalized quadratic
phase functions on $\mathbb{Z}_N$, where no algebraic definition of rank appears to be of any use to
us.)

We simultaneously define the rank of $\pi$ and of $\kappa$ to be $-\log_p \mathbb{E}_{h_1, \dots, h_d} \omega^{\kappa(h_1, \dots, h_d)}$. Note
that the quantity $\mathbb{E}_{h_1, \dots, h_d} \omega^{\kappa(h_1, \dots, h_d)}$ has a natural interpretation: for each $(h_2, \dots, h_d)$ the
expectation over $h_1$ is 1 if $\kappa(h_1, \dots, h_d)$ is constant as $h_1$ varies (since by multilinearity
this constant must be 0) and 0 otherwise. Therefore, as in the case when $d = 2$, we can
think of $\mathbb{E}_{h_1, \dots, h_d} \omega^{\kappa(h_1, \dots, h_d)}$ as the density of the "kernel" of $\kappa$. The big difference is that
this "kernel" is not a subspace of anything. Rather, it is a strange subset of $(\mathbb{F}_p^n)^{d-1}$ and
its density does not have to be a negative integer power of $p$ (so the rank is not usually
a positive integer). It is for this reason that we refer to this definition as an "analytic"
definition of rank rather than an algebraic one.
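
The analytic rank just defined can be computed by brute force for small examples, which also shows that it need not be an integer. The sketch below is a minimal illustration, not part of the argument: the function names `kappa` and `analytic_rank` and the sample cubic $\pi(x) = x_1^2 x_2$ on $\mathbb{F}_5^2$ are assumptions chosen for the example.

```python
import cmath
import math
from itertools import product

def kappa(pi, hs, p):
    """kappa(h_1,...,h_d) = sum over eps of (-1)^(d-|eps|) * pi(eps.h), mod p."""
    d = len(hs)
    n = len(hs[0])
    total = 0
    for eps in product((0, 1), repeat=d):
        point = tuple(sum(e * h[i] for e, h in zip(eps, hs)) % p
                      for i in range(n))
        total += (-1) ** (d - sum(eps)) * pi(point)
    return total % p

def analytic_rank(pi, d, p, n):
    """Return (rank, density), where density = E_{h_1..h_d} omega^{kappa(h)}
    and rank = -log_p(density). Brute force: only for tiny p, n, d."""
    roots = [cmath.exp(2j * cmath.pi * j / p) for j in range(p)]
    points = list(product(range(p), repeat=n))
    total = 0j
    count = 0
    for hs in product(points, repeat=d):
        total += roots[kappa(pi, hs, p)]
        count += 1
    density = (total / count).real  # imaginary part vanishes up to rounding
    return -math.log(density, p), density

def sample_cubic(x):
    # Hypothetical example: pi(x) = x_1^2 * x_2 on F_5^2.
    return x[0] * x[0] * x[1]
```

For this cubic the associated form is $\kappa(a, b, c) = 2(a_1 b_1 c_2 + a_1 b_2 c_1 + a_2 b_1 c_1)$, and counting the pairs $(b, c)$ for which $\kappa(\cdot, b, c)$ vanishes identically gives a "kernel" of density $65/625$. The rank is therefore $-\log_5(65/625) \approx 1.41$, which is not an integer.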

The idea of defining rank analytically is one of the main ideas of this paper. On its own, 
it may not seem like much of an idea, since all we are doing is turning the conclusion of 
a lemma we would like to have about high-rank forms into a definition. The real point, 
which will become clearer later, is that we would expect to pay a heavy price for this when 
it comes to dealing with low-rank forms. But in fact we have ways of dealing with those 
as well, so we end up with reasonably clean proofs and do not have to delve too deeply 
into the structure of low-rank polynomials. (However, if we had needed such results, then 
we might well have been able to make use of a recent theorem of Green and Tao [GrT07],
which says that if a polynomial phase has low rank in our sense, then it can be made out 
of a bounded number of polynomial phases of lower degree.) 

Let us go back to the easy task of checking that the analogue of Lemma 3.1 for
higher-degree polynomials does indeed hold with our definition of rank.



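Lemma 3.2. Let $\pi$ be a polynomial on $\mathbb{F}_p^n$ of degree $d$ and rank $r$. Then

$$|\mathbb{E}_x \omega^{\pi(x)}| \le p^{-r/2^{d-1}}.$$

Proof. As is well known, the $U^k$ norms of a function increase with $k$. This remains true
even when one allows $k$ to equal 1, in which case we define $\|f\|_{U^1}$ to be $|\mathbb{E}_x f(x)|$. This is
in fact only a seminorm, but it is still the case that $\|f\|_{U^1} \le \|f\|_{U^2}$. Therefore, $|\mathbb{E}_x \omega^{\pi(x)}|$
is at most the $U^{d-1}$ norm of $\omega^{\pi}$. But

$$\|\omega^{\pi}\|_{U^{d-1}}^{2^{d-1}} = \mathbb{E}_{x, h_1, \dots, h_{d-1}} \omega^{\sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{|\epsilon|} \pi(x + \epsilon.h)}$$

and

$$\kappa(x, h_1, \dots, h_{d-1}) = \sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{d-|\epsilon|} \pi(\epsilon.h) - \sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{d-|\epsilon|} \pi(x + \epsilon.h),$$

by Lemma 2.4 with $x = 0$ and $(h_1, \dots, h_d)$ replaced by $(x, h_1, \dots, h_{d-1})$. Therefore,
$\sum_{\epsilon \in \{0,1\}^{d-1}} (-1)^{d-|\epsilon|} \pi(x + \epsilon.h)$ is equal to $-\kappa(x, h_1, \dots, h_{d-1})$ plus a function that depends
on $h_1, \dots, h_{d-1}$ only. By this fact, the remarks following the definition of rank, and the
fact that the factor $(-1)^{d+1}$ does not affect the modulus of the inner expectation below,

$$\|\omega^{\pi}\|_{U^{d-1}}^{2^{d-1}} \le \mathbb{E}_{h_1, \dots, h_{d-1}} \big| \mathbb{E}_x \omega^{(-1)^{d+1} \kappa(x, h_1, \dots, h_{d-1})} \big| = \mathbb{E}_{h_1, \dots, h_{d-1}} \big| \mathbb{E}_x \omega^{\kappa(x, h_1, \dots, h_{d-1})} \big| = p^{-r},$$

which proves the result. □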

The final lemma in this section is a simple example of a statement that might at first 
appear to demand some knowledge of the structure of low-rank polynomials (which is what 
we used in the quadratic case) but that can in fact be given a straightforward analytic proof. 
It gives us a very useful dichotomy for degree-d phase functions: either they have small $U^d$
norm or they have not too large $(U^d)^*$ norm. Moreover, which of these is the case depends
only on the rank of the polynomial.

Lemma 3.3. Let $\pi$ be a polynomial of degree d and rank r. Then

$$\|\omega^\pi\|_{U^d} = p^{-r/2^d} \quad\text{and}\quad \|\omega^\pi\|_{U^d}^* = p^{r/2^d}.$$

Proof. The evaluation of the $U^d$ norm follows from Lemma 2.4 and the definition of rank.
Indeed, that lemma tells us that for all $x\in\mathbb{F}_p^n$ and $y\in(\mathbb{F}_p^n)^d$, we have the identity

$$\sum_{\epsilon\in\{0,1\}^d}(-1)^{d-|\epsilon|}\pi(x+\epsilon.y) = \kappa(y),$$

where $\kappa$ is the symmetric multilinear form associated with the polynomial $\pi$. It follows
that

$$\|\omega^\pi\|_{U^d}^{2^d} = \mathbb{E}_{y\in(\mathbb{F}_p^n)^d}\,\omega^{\kappa(y)} = p^{-r}.$$

For the dual norm, given a function $f:\mathbb{F}_p^n\to\mathbb{C}$, let us define the nonlinear operator
$\mathcal{D}_{2^d-1}f$ to be the function whose value at x is

$$\mathbb{E}_{y\in(\mathbb{F}_p^n)^d}\prod_{\epsilon\in\{0,1\}^d\setminus 0}C^{|\epsilon|}f(x+\epsilon.y).$$

Now for any function $g:\mathbb{F}_p^n\to\mathbb{C}$, the definition of rank gives us that

$$\langle g,\omega^\pi\rangle = p^r\,\mathbb{E}_x\,g(x)\,\mathcal{D}_{2^d-1}\omega^\pi(x).$$

By the generalized Cauchy-Schwarz inequality for the uniformity norms and the definition
of $\mathcal{D}_{2^d-1}$, we find that $|\langle g,\omega^\pi\rangle|$ is bounded above by $p^r\|g\|_{U^d}\|\omega^\pi\|_{U^d}^{2^d-1}$. By the first part of
the lemma, $|\langle g,\omega^\pi\rangle|$ is at most $p^r\|g\|_{U^d}\,p^{-r(2^d-1)/2^d} = \|g\|_{U^d}\,p^{r/2^d}$. It follows that
$\|\omega^\pi\|_{U^d}^* \le p^{r/2^d}$. □

4. A DECOMPOSITION INTO HIGH-RANK POLYNOMIAL PHASE FUNCTIONS 

In this section we shall prove a decomposition theorem that is similar to Corollary 2.6
but with two important differences. The first is that we shall split a function up into 
polynomial phase functions that do not all have the same degree. The second, which 
is central to our entire argument, is that we need the ranks of these polynomial phase 
functions to be large. Precisely how large is a complicated matter: we have a series of 
parameters and it is essential to understand how they depend on each other when it comes 
to applying the theorem later. 

To begin with, we shall ignore the ranks, and obtain a preliminary decomposition by
simply iterating Corollary 2.6. In the statement of the theorem, we make a slightly artificial
distinction when we discuss what various functions depend on. Given a function f of two
variables x and y, it is sometimes convenient to rewrite f(x,y) as $f_x(y)$ and think of it
as a function of y that depends on x. And then, if x is clear from the context, one may
even suppress the dependence on x in the notation. For instance, if one is proving a
statement of the form, "For every x there is a function $f:\mathbb{R}\to\mathbb{R}$ such that ...," then
one could regard this as a proof of the existence of a function F of two variables (x and
a real number). These considerations apply to the quantities $M_{i,0}$ below, which can be
thought of as constants that depend on several real variables, a function $\eta_i$, and a small
real parameter $\epsilon$, or as functions of real variables that depend on $\eta_i$ and $\epsilon$, or as functions
of several variables, some of which are large reals, one of which is itself a function, and one
of which is a small real. We highlight this matter here, because it is very important for
our proof that we do not accidentally have a directed cycle of dependences, and it is not
particularly easy to keep track of whether we have done so.

Theorem 4.1. Let s and k be positive integers with $s\le k$ and let $\epsilon>0$. For each i from s
to k let $\eta_i$ be a function from $\mathbb{R}_+^{i-s+1}$ to $\mathbb{R}_+$ that is strictly decreasing in each variable. Let
f be a function on $\mathbb{F}_p^n$ that takes values in the interval [−1,1]. Then there are functions
$M_{s,0},\dots,M_{k,0}$, with $M_{i,0}:\mathbb{R}_+^{i-s}\to\mathbb{R}_+$ a function that is increasing in each variable (and
that depends on $\epsilon$ and the function $\eta_i$), and a decomposition

$$f = f_s+\dots+f_k+g_k+h_s+\dots+h_k$$

with the following properties for every i between s and k.

• We can write $f_i=\sum_j\lambda_{i,j}\omega^{\pi_{i,j}}$, where the functions $\pi_{i,j}$ are polynomials of degree i
and the $\lambda_{i,j}$ are real coefficients with $\sum_j|\lambda_{i,j}| = M_i \le M_{i,0}(M_s,\dots,M_{i-1})$.

• Let $g_i = f_{i+1}+\dots+f_k+g_k+h_{i+1}+\dots+h_k$. Then $\|g_i\|_{U^{i+1}}\le\eta_i(M_s,\dots,M_i)$ for
each i.

• The functions $f_i$ and $f_i+h_i$ take values in $[-2^{i-s},2^{i-s}]$ and $g_i$ takes values in
$[-2^{i+1-s},2^{i+1-s}]$.

• We have the estimate $\|h_i\|_2\le 2^{s-i-1}\epsilon$.

Proof. We prove this by induction on k. The base case, when k = s, is precisely Corollary
2.6 (with $\epsilon$ replaced by $\epsilon/2$), with the additional trivial observation that if $f=f_s+g_s+h_s$
and f and $f_s+h_s$ both take values in [−1,1], then $g_s$ takes values in [−2,2]. So now let us
assume that we have the result for k and let us prove it for k + 1.

To do this, we simply apply Corollary 2.6 to the function $g_k$, or more precisely to the
function $2^{s-k-1}g_k$, which takes values in [−1,1]. Applying it with k + 1 instead of k and $\epsilon$
replaced by $2^{2s-2k-3}\epsilon$ and $\eta$ replaced by the function $M\mapsto 2^{s-k-1}\eta_{k+1}(M_s,\dots,M_k,M)$, and
then multiplying everything by $2^{k+1-s}$, we find that we can write $g_k$ as $\sum_j\lambda_{k+1,j}\omega^{\pi_{k+1,j}} +
g_{k+1}+h_{k+1}$, with the following properties.

• $\sum_j|\lambda_{k+1,j}| = M_{k+1}$ is bounded above by a constant $M_{k+1,0}$ that depends on $\epsilon$ and
the function $M\mapsto 2^{s-k-1}\eta_{k+1}(M_s,\dots,M_k,M)$.

• $\|g_{k+1}\|_{U^{k+2}}\le\eta_{k+1}(M_s,\dots,M_k,M_{k+1})$.

• $f_{k+1}$ and $f_{k+1}+h_{k+1}$ take values in $[-2^{k+1-s},2^{k+1-s}]$, and therefore $g_{k+1}$ takes values
in $[-2^{k+2-s},2^{k+2-s}]$.

• $\|h_{k+1}\|_2\le 2^{s-k-2}\epsilon$.

This almost completes the proof, but it remains to check that $M_{k+1,0}$ depends just on
$M_s,\dots,M_k$, $\epsilon$ and $\eta_{k+1}$. This is true since it depends just on $\epsilon$ and the function $M\mapsto
2^{s-k-1}\eta_{k+1}(M_s,\dots,M_k,M)$, and that function depends on $M_s,\dots,M_k$ and $\eta_{k+1}$ only. □

The next step is to prove a result that can be used as a tool for eliminating polynomial 
phase functions of low rank. 

Proposition 4.2. Let $\epsilon>0$ and M be constants, and let $\eta$ be a constant such that $0<
\eta\le\epsilon^2/M$. Then for every positive real number R there is a constant $c=c(\epsilon,R,M)$ with
the following property. Let $f:\mathbb{F}_p^n\to\mathbb{R}$ be a function such that $\|f\|_{U^m}\le c$ and suppose that
we have a decomposition $f=\sum_i\lambda_i\omega^{\pi_i}+g+h$ such that the functions $\pi_i$ are polynomials
of degree m, $\sum_i|\lambda_i| = M$, $\|g\|_{U^{m+1}}\le\eta$, and $\|h\|_2\le\epsilon$. Then there is also a decomposition
$f=\sum_i\lambda_i'\omega^{\pi_i'}+g+h''$ such that the $\pi_i'$ are polynomials of degree m and rank at least R,
$\sum_i|\lambda_i'|\le M$, $\|h''\|_2\le 5\epsilon$, and g is the same function as before.

Proof. Our approach is a natural one: if $\|f\|_{U^m}$ is very small, then f has hardly any
correlation with a low-rank degree-m phase function, so we would not expect such functions
to play an important role in the decomposition. And indeed, we shall show that the $L^2$
norm of the "low-rank part" of the decomposition is small enough for us to be able to
absorb that part into the $L^2$ error term h.

First, we need to identify the "low-rank part". To do this, we choose t such that
$M^2p^{-t}=\epsilon^2$, and we find a "rank gap" of length t; that is, we find a number $R_1\ge R$ such
that

$$\sum\{|\lambda_i| : R_1<r(\pi_i)<R_1+t\}\le\epsilon,$$

where we have written $r(\pi_i)$ to stand for the rank of $\pi_i$. We know that $\sum_i|\lambda_i|\le M$, so we
must be able to find such an $R_1$ with $R_1\le R+tM/\epsilon$.
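
The pigeonhole step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `gap_length` and `find_rank_gap` are ours, and the search tries the disjoint windows $(R+jt, R+(j+1)t)$ for j = 0, 1, 2, ... in turn. Each window that fails carries coefficient mass greater than $\epsilon$, so fewer than $M/\epsilon$ windows can fail, which is the source of the bound $R_1\le R+tM/\epsilon$.

```python
from math import log

def gap_length(M, eps, p):
    """The t with M**2 * p**(-t) == eps**2."""
    return 2 * log(M / eps) / log(p)

def find_rank_gap(ranks, coeffs, R, t, eps):
    """Return the first R1 = R + j*t (j = 0, 1, ...) such that the terms with
    R1 < rank < R1 + t have total absolute coefficient at most eps."""
    j = 0
    while True:
        R1 = R + j * t
        mass = sum(abs(c) for r, c in zip(ranks, coeffs) if R1 < r < R1 + t)
        if mass <= eps:
            return R1
        j += 1  # terminates: windows beyond the largest rank are empty

# Hypothetical data: ranks r(pi_i) and coefficients lambda_i.
ranks = [1.0, 2.5, 2.7, 6.0, 9.0]
coeffs = [0.5, -0.4, 0.4, 0.3, 0.4]
M = sum(abs(c) for c in coeffs)
R1 = find_rank_gap(ranks, coeffs, R=2, t=1, eps=0.35)
low = [i for i, r in enumerate(ranks) if r <= R1]
high = [i for i, r in enumerate(ranks) if r >= R1 + 1]
```

Here the window (2, 3) is rejected because it contains two terms of total mass 0.8, and the next window (3, 4) is empty, so the search stops at $R_1 = 3$.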

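Let $L=\{i : r(\pi_i)\le R_1\}$ and $H=\{i : r(\pi_i)\ge R_1+t\}$. (These letters stand for "low"
and "high", respectively.) Then we can write

$$f=\sum_{i\in L}\lambda_i\omega^{\pi_i}+\sum_{i\in H}\lambda_i\omega^{\pi_i}+f_M+g+h,$$

where $f_M$ is the sum of the remaining terms $\lambda_i\omega^{\pi_i}$, those with $R_1<r(\pi_i)<R_1+t$,
and $\|f_M\|_2\le\|f_M\|_\infty\le\epsilon$. Let $f_L=\sum_{i\in L}\lambda_i\omega^{\pi_i}$ and $f_H=\sum_{i\in H}\lambda_i\omega^{\pi_i}$, so that f has a
decomposition of the form $f_L+f_H+g+h'$, where $f_L$ is made out of functions $\omega^\pi$ with $\pi$ of
rank at most $R_1$, $f_H$ is made out of such functions with $\pi$ of rank at least $R_1+t$, $h'=h+f_M$
has $L^2$ norm at most $2\epsilon$, and $\|g\|_{U^{m+1}}\le\eta$. Clearly we also have $\sum_{i\in H}|\lambda_i|\le M$.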

We would like to show, using the hypothesis that f is highly uniform of degree m, that
$\|f_L\|_2$ is very small, so that $f_L$ can be incorporated into the $L^2$ error. To do this, let us
bound $\|f_L\|_2^2=\langle f_L,f_L\rangle$ above by

$$|\langle f_L,f\rangle|+|\langle f_L,f_H\rangle|+|\langle f_L,g\rangle|+|\langle f_L,h'\rangle|$$

and consider each of the terms on the right-hand side in turn.

First, we bound $|\langle f_L,f\rangle|$ above by $\|f_L\|_{U^m}^*\|f\|_{U^m}$. But by Lemma 3.3, $\|f_L\|_{U^m}^*\le Mp^{R_1}$,
so if we choose c to satisfy $cMp^{R_1}\le cMp^{R+tM/\epsilon}\le\epsilon^2$, then $|\langle f_L,f\rangle|\le\epsilon^2$. Note that t
was chosen in terms of M and $\epsilon$, so c will be bounded in terms of R, M and $\epsilon$.

Next, we consider $|\langle f_L,f_H\rangle|\le\|f_L\|_{U^m}^*\|f_H\|_{U^m}$ and use the fact that $\|f_H\|_{U^m}\le Mp^{-(R_1+t)}$,
again by Lemma 3.3. Since $\|f_L\|_{U^m}^*\le Mp^{R_1}$, this gives us the bound $|\langle f_L,f_H\rangle|\le
M^2p^{R_1-(R_1+t)}=M^2p^{-t}$, which is at most $\epsilon^2$ by our choice of t.

The next term, $|\langle f_L,g\rangle|$, is bounded above by $\|f_L\|_{U^{m+1}}^*\|g\|_{U^{m+1}}$. Since a degree-m
polynomial phase function has $(U^{m+1})^*$ norm 1, by Lemma 2.5, the triangle inequality
tells us that $\|f_L\|_{U^{m+1}}^*\le M$, and the initial decomposition gave us the bound $\|g\|_{U^{m+1}}\le\eta$.
Since we have insisted that $\eta\le\epsilon^2/M$, we deduce that $|\langle f_L,g\rangle|\le\epsilon^2$.

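Finally, we have that $|\langle f_L,h'\rangle|\le 2\epsilon\|f_L\|_2$. The upshot of all these computations is that
$\|f_L\|_2^2\le 3\epsilon^2+2\epsilon\|f_L\|_2$, that is, $(\|f_L\|_2-3\epsilon)(\|f_L\|_2+\epsilon)\le 0$, which implies that $\|f_L\|_2\le 3\epsilon$.

So provided that $\|f\|_{U^m}\le c=c(\epsilon,M,R)$, we have successfully decomposed f as

$$f=\sum_i\lambda_i'\omega^{\pi_i'}+g+h'',$$

where the $\pi_i'$ are polynomials of degree m and rank at least R, we have set $h''=h+f_M+f_L$,
and we have the bounds $\sum_i|\lambda_i'|\le M$, $\|g\|_{U^{m+1}}\le\eta$, and $\|h''\|_2\le 5\epsilon$. □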

We now apply Proposition 4.2 iteratively to the decomposition obtained in Theorem 4.1
in order to make all the polynomial phase functions have high rank and thereby prove our 
main theorem. 

Theorem 4.3. Let s and k be positive integers with $s\le k$, let $\epsilon>0$, and let $\eta:\mathbb{R}_+^{k-s+1}\to
\mathbb{R}_+$ be a function that is strictly decreasing in each variable. Let $R_s,\dots,R_k$ be functions
from $\mathbb{R}_+^{k-s+1}$ to $\mathbb{R}_+$ that are strictly increasing in each variable. Then there are functions
$M_{s,0},\dots,M_{k,0}$, where $M_{i,0}$ is a function from $\mathbb{R}_+^{i-s}$ to $\mathbb{R}_+$ (that depends on $\epsilon$, $\eta$ and the
functions $R_i,\dots,R_k$) and a constant $c'=c'(\epsilon,\eta,R_s,\dots,R_k)>0$, such that if f is any
function that takes values in [−1,1] and satisfies $\|f\|_{U^s}\le c'$, then there are real numbers
$M_s,\dots,M_k$ and a decomposition

$$f=f_s'+\dots+f_k'+g+h$$

with the following properties.

• We can write $f_i'=\sum_j\lambda_{i,j}\omega^{\pi_{i,j}}$, where the functions $\pi_{i,j}$ are polynomials of degree i
and the $\lambda_{i,j}$ are real coefficients with $\sum_j|\lambda_{i,j}| = M_i\le M_{i,0}(M_s,\dots,M_{i-1})$.

• For each i, each polynomial $\pi_{i,j}$ has rank at least $R_i(M_s,\dots,M_k)$.

• $\|g\|_{U^{k+1}}\le\eta(M_s,\dots,M_k)$.

• $\|h\|_2\le\epsilon$.

Proof. We begin by applying Theorem 4.1. For that we shall need to specify the functions
$\eta_s,\dots,\eta_k$. We shall do that soon, but for now let us simply apply it for some general
functions $\eta_i$ (bearing in mind that $\eta_i$ is a function of the variables $M_s,\dots,M_i$).

Let $f_s+\dots+f_k+g_k+h_s+\dots+h_k$ be the decomposition that results. We begin by isolating
the function $g_{k-1}=f_k+g_k+h_k$ from this, about which we know that $f_k=\sum_j\lambda_{k,j}\omega^{\pi_{k,j}}$,
where the functions $\pi_{k,j}$ are polynomials of degree k, $\sum_j|\lambda_{k,j}| = M_k\le M_{k,0}(M_s,\dots,M_{k-1})$
and $\|g_k\|_{U^{k+1}}\le\eta_k(M_s,\dots,M_k)$. Moreover, we know that $g_{k-1}$, $f_k$ and $f_k+h_k$ take values
in $[-2^{k-s},2^{k-s}]$, $g_k$ takes values in $[-2^{k+1-s},2^{k+1-s}]$, $\|h_k\|_2\le 2^{s-k-1}\epsilon$ and $\|g_{k-1}\|_{U^k}\le
\eta_{k-1}(M_s,\dots,M_{k-1})$.

We are already in a position to specify $\eta_k$: we take $\eta_k(M_s,\dots,M_k)$ to be the minimum of
$\eta(M_s,\dots,M_k)$ and $2^{2(s-k-1)}\epsilon^2/M_k$. We shall also take g to be $g_k$, so we have the estimate
$\|g\|_{U^{k+1}}\le\eta(M_s,\dots,M_k)$, which will give us what we want provided that we do not increase
any of the $M_i$ when we find our new high-rank decomposition.

Now let us suppose that we have chosen the functions $\eta_k,\eta_{k-1},\dots,\eta_i$. We shall choose
$\eta_{i-1}$ as follows. First, apply Proposition 4.2 to the function $g_{i-1}=f_i+g_i+h_i$ with $\epsilon$ replaced
by $2^{s-i-1}\epsilon$ and with $R=R_i(M_s,\dots,M_{i-1},N_i,\dots,N_k)$, where $N_i=M_{i,0}(M_s,\dots,M_{i-1})$ and
$N_h=M_{h,0}(M_s,\dots,M_{i-1},N_i,\dots,N_{h-1})$ for each h from i + 1 to k. Note that $N_h\ge M_h$
for each h. (Note also that $N_h$ has a dependence on i, but we are regarding i as fixed
and suppressing that dependence.) That tells us that if $\|g_{i-1}\|_{U^i}\le c(2^{s-i-1}\epsilon,R,N_i)$
and $\|g_i\|_{U^{i+1}}\le 2^{2(s-i-1)}\epsilon^2/M_i$, then we can split $g_{i-1}$ up as $f_i'+g_i+h_i'$, where $f_i'=
\sum_j\lambda_{i,j}'\omega^{\pi_{i,j}'}$ for some degree-i polynomials $\pi_{i,j}'$ of rank at least R, $\sum_j|\lambda_{i,j}'| = M_i'\le
M_{i,0}(M_s,\dots,M_{i-1})$, and $\|h_i'\|_2\le 5\cdot 2^{s-i-1}\epsilon$. So let us choose our function $\eta_{i-1}$ in such
a way that $\eta_{i-1}(M_s,\dots,M_{i-1})\le c(2^{s-i-1}\epsilon,R,N_i)$, and in order to get the next stage to
work, let us also insist that $\eta_{i-1}(M_s,\dots,M_{i-1})\le 2^{2(s-i)}\epsilon^2/M_{i-1}$.

We are now ready to choose our constant c'. This we do by simply continuing the above
procedure for one more step. That is, we think of f as $g_{s-1}$, and we define a "function"
$\eta_{s-1}$ by setting i = s in the above paragraph. However, $\eta_{s-1}$ no longer depends on any
variables, which is what we want, since we are trying to define a constant. To be slightly
more explicit, we define R to be $R_s(N_s,\dots,N_k)$, where the $N_h$ are defined as above (with
i = s), and we choose c' to be $c(\epsilon/2,R)$.

We should point out that we have very carefully (and only just) avoided a circular
dependence of parameters in the previous two paragraphs: we chose the function $\eta_{i-1}$ to
be bounded above by a function of $\epsilon$ and R; R in turn depends on $M_s,\dots,M_{i-1}$ and
$N_i,\dots,N_k$; but $N_i,\dots,N_k$ depend on $M_s,\dots,M_{i-1}$, $\epsilon$, and the functions $\eta_i,\dots,\eta_k$, which
we have already chosen. Thus, once we know $M_s,\dots,M_{i-1}$, $\epsilon$ and the functions $\eta_i,\dots,\eta_k$,
we can determine $N_i,\dots,N_k$, then R, and finally $\eta_{i-1}(M_s,\dots,M_{i-1})$.

We are almost finished. Proposition 4.2 guarantees that $M_i'\le M_i$ for each i. Since
the functions $R_i$ are increasing in each variable, we have guaranteed that the rank of each
polynomial $\pi_{i,j}'$ is at least $R_i(M_s',\dots,M_k')$, as required. Similarly, $\|g\|_{U^{k+1}}\le\eta(M_s',\dots,M_k')$.

Finally, setting $h=h_s'+\dots+h_k'$, we have $\|h\|_2\le 5\sum_{i=s}^k 2^{i-s-1}\epsilon\le 5\cdot 2^{k-s}\epsilon$. Obviously
we can get rid of the factor $5\cdot 2^{k-s}$ by applying the above argument with $\epsilon$ replaced by
$\epsilon/(5\cdot 2^{k-s})$. □

In our application, we shall actually use a slightly simpler statement that follows
immediately from the previous theorem.

Corollary 4.4. Let s and k be positive integers with $s\le k$, let $\epsilon>0$, and let $\eta:\mathbb{R}_+\to
\mathbb{R}_+$ be a function that is strictly decreasing in each variable. Let R be a function from
$\mathbb{R}_+$ to $\mathbb{R}_+$ that is strictly increasing in each variable. Then there is a constant $M_0$, that
depends on $\epsilon$, $\eta$ and the function R, and a constant $c''=c''(\epsilon,\eta,R)>0$, such that if f
is any function that takes values in [−1,1] and satisfies $\|f\|_{U^s}\le c''$, then there is a real
number M and a decomposition $f=f'+g+h$ with the following properties.

• We can write $f'=\sum_j\lambda_j\omega^{\pi_j}$, where each function $\pi_j$ is a polynomial of degree
between s and k, and the $\lambda_j$ are real coefficients with $\sum_j|\lambda_j| = M\le M_0$.

• For each j, each polynomial $\pi_j$ has rank at least R(M).

• $\|g\|_{U^{k+1}}\le\eta(M)$.

• $\|h\|_2\le\epsilon$.

Proof. Let us apply Theorem 4.3 with all the functions $R_i$ defined by $R_i(M_s,\dots,M_k)=
R(M_s+\dots+M_k)$ and with $\eta(M_s,\dots,M_k)$ replaced by $\eta(M_s+\dots+M_k)$. Now define a
sequence $N_s,\dots,N_k$ by taking $N_s=M_{s,0}$ (where $M_{s,0}$ is as given to us by Theorem 4.3),
and in general $N_{i+1}=M_{i+1,0}(N_s,\dots,N_i)$. Then for each i, $N_i$ is an upper bound for how
large $M_{i,0}$ can possibly be. Therefore, if we take $M_0$ to be $N_s+\dots+N_k$, then we obtain the
first property from the corresponding property in Theorem 4.3, with $M=M_s+\dots+M_k$.
The remaining three properties follow immediately from their previous counterparts. □

5. Degree-s independent systems of linear forms 

So far, we have made no use of the condition that we are dealing with linear forms
$L_1,\dots,L_m$ that are degree-s independent. Recall that a linear system $L_1,\dots,L_m$ was said
to be degree-s independent if the functions $L_1^s,\dots,L_m^s$ are linearly independent, where we
view the linear forms $L_1,\dots,L_m$ as defined on $\mathbb{F}_p$.




In this section we shall collect together some facts that will be needed when we come
to apply Theorem 4.3 in order to estimate expressions of the form $\mathbb{E}_x\prod_{i=1}^m f(L_i(x))$. We
begin by showing that if the linear forms $L_1,\dots,L_m$ are degree-s independent, then they
are degree-t independent for all t > s (as long as p is sufficiently large). This is not a
surprising observation, but it will be very important to us later.

Note that in the next lemma our linear forms are functions from $(\mathbb{F}_p)^d$ to $\mathbb{F}_p$ and
that the $x_i$ are elements of $(\mathbb{F}_p)^d$.

Lemma 5.1. Let s be a positive integer and let $L_1,\dots,L_m$ be linear forms in d variables
that take values in $\mathbb{F}_p$. Suppose also that p > s. Then the degree-s forms $L_i^s$ are linearly
independent if and only if the s-linear forms $(x_1,\dots,x_s)\mapsto L_i(x_1)\cdots L_i(x_s)$ are linearly
independent.

Proof. Let $a=(a_1,\dots,a_s)$ be an s-tuple of elements of $\mathbb{F}_p$. We shall use the identity
$\sum_{\epsilon\in\{0,1\}^s}(-1)^{s-|\epsilon|}(\epsilon.a)^s=s!\,a_1\cdots a_s$, which can easily be proved by induction (in a similar
manner to some of the results in Section 3 of this paper).

Suppose, then, that $\sum_i\lambda_iL_i(x)^s=0$ for every $x\in(\mathbb{F}_p)^d$. Then if we choose elements
$x_1,\dots,x_s$ of $(\mathbb{F}_p)^d$, we know that $\sum_i\lambda_i\bigl(L_i(\sum_j\epsilon_jx_j)\bigr)^s=0$ for every $\epsilon\in\{0,1\}^s$. Using the
linearity of the $L_i$ and the identity, we deduce that

$$s!\sum_i\lambda_iL_i(x_1)\cdots L_i(x_s)=\sum_i\lambda_i\sum_{\epsilon\in\{0,1\}^s}(-1)^{s-|\epsilon|}\Bigl(\sum_j\epsilon_jL_i(x_j)\Bigr)^s$$
$$=\sum_{\epsilon\in\{0,1\}^s}(-1)^{s-|\epsilon|}\sum_i\lambda_i\Bigl(L_i\Bigl(\sum_j\epsilon_jx_j\Bigr)\Bigr)^s$$
$$=0.$$

Since p > s, we know that $s!\ne 0$, so if the s-linear forms $(L_i(x_1)\cdots L_i(x_s))_{i=1}^m$ are
linearly independent, then all the $\lambda_i$ must be 0. This implies that the functions $L_i^s$ are
linearly independent.

The other direction is trivial, since $L_i(x)^s$ is just $L_i(x_1)\cdots L_i(x_s)$ with all the $x_j$ equal
to x. □
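
The identity used at the start of this proof is easy to test numerically. The following Python sketch (the function names are ours, and the check is exhaustive only for the small values of p and s passed to it) evaluates both sides modulo p.

```python
from itertools import product
from math import factorial, prod

def alternating_power_sum(a, p):
    """Sum over eps in {0,1}^s of (-1)^(s-|eps|) * (eps.a)^s, reduced mod p."""
    s = len(a)
    total = 0
    for eps in product((0, 1), repeat=s):
        dot = sum(e * x for e, x in zip(eps, a))
        sign = (-1) ** (s - sum(eps))
        total += sign * pow(dot, s, p)
    return total % p

def check_identity(p, s):
    """Check the identity for every s-tuple a of elements of F_p."""
    return all(
        alternating_power_sum(a, p) == (factorial(s) * prod(a)) % p
        for a in product(range(p), repeat=s)
    )
```

For example, `check_identity(5, 3)` runs over all 125 triples in $\mathbb{F}_5$. The identity in fact holds over the integers, so the check does not depend on p; the hypothesis p > s is needed only to divide by s! afterwards.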

Lemma 5.2. Let s be a positive integer. If a system $L_1,\dots,L_m$ of linear forms is degree-s
independent, then it is degree-t independent for all integers t such that $s\le t<p$.

Proof. By Lemma 5.1 it is enough to prove the result for the s-linear and t-linear forms
defined there instead. So let us suppose that we have $\lambda_1,\dots,\lambda_m$ such that

$$\sum_{i=1}^m\lambda_iL_i(x_1)\cdots L_i(x_t)=0$$

for every $x_1,\dots,x_t\in\mathbb{F}_p^d$. If the $\lambda_i$ are not all 0 then we can find $x\in\mathbb{F}_p^d$ such that not all
the $\lambda_iL_i(x)$ are zero. For such an x, let $\mu_i=\lambda_iL_i(x)^{t-s}$ for each i and observe that the $\mu_i$
are not all zero. Then

$$\sum_{i=1}^m\mu_iL_i(x_1)\cdots L_i(x_s)=\sum_{i=1}^m\lambda_iL_i(x_1)\cdots L_i(x_s)L_i(x)^{t-s}=0$$

for every $x_1,\dots,x_s\in\mathbb{F}_p^d$. This is a contradiction if the s-linear forms are linearly
independent, so the lemma is proved. □

Before we make use of degree-s independence, we need to prove some more lemmas about 
the behaviour of multilinear forms, this time under the additional assumption that they 
have high rank. We first need to establish that $\omega^\kappa$ behaves like a quasirandom function
whenever $\kappa$ is a high-rank symmetric multilinear form. Before we do this, we prove a
simple lemma which we will use in the proof. 

Lemma 5.3. Let $d\ge 2$ and let $\kappa$ be a homogeneous d-linear form on $\mathbb{F}_p^n$ of rank r. For
each $x_d$ let $r(x_d)$ be the rank of the (d − 1)-linear form $(x_1,\dots,x_{d-1})\mapsto\kappa(x_1,\dots,x_{d-1},x_d)$.
Then $p^{-r}=\mathbb{E}_{x_d}p^{-r(x_d)}$.

Proof. Recall that if r is the rank of a homogeneous d-linear form $\kappa$, then $p^{-r}$ is equal to
the density of the set of $(x_2,\dots,x_d)$ such that $\kappa(x,x_2,\dots,x_d)=0$ for every x. The result
follows immediately, provided that when d = 2 we interpret the rank r of a 1-linear form
to be 0 if it is identically zero and ∞ otherwise (so that $p^{-r}$ is 1 or 0, respectively). □
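
The averaging statement can be checked exactly on small examples. The sketch below is an illustration under simplifying assumptions: it takes n = 1, so that each variable is a single element of $\mathbb{F}_p$, and the helper names are ours. It computes $p^{-r}$ as the kernel density described in the proof, using exact fractions, and compares it with the average over $x_d$ of the same quantity for the restricted forms.

```python
from fractions import Fraction
from itertools import product

def kernel_density(form, p, d):
    """p^(-rank) of a d-linear form on F_p: the density of (x_2, ..., x_d)
    with form(x, x_2, ..., x_d) = 0 for every x. For d = 1 this is 1 if the
    form is identically zero and 0 otherwise."""
    good = 0
    for tail in product(range(p), repeat=d - 1):
        if all(form(x, *tail) % p == 0 for x in range(p)):
            good += 1
    return Fraction(good, p ** (d - 1))

def restrict_last(form, xd):
    """The (d-1)-linear form obtained by fixing the last variable to xd."""
    return lambda *xs: form(*xs, xd)

def averaged_density(form, p, d):
    """E over x_d of p^(-r(x_d))."""
    return sum(kernel_density(restrict_last(form, xd), p, d - 1)
               for xd in range(p)) / p
```

For the trilinear form xyz on $\mathbb{F}_3$, both sides equal 5/9: the restriction to z = 0 is identically zero and contributes 1, while z = 1 and z = 2 each contribute 1/3.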

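Lemma 5.4. Let $\kappa(x_1,\dots,x_d)$ be a symmetric d-linear form on $\mathbb{F}_p^n$ of rank at least r. For
each $I\subset[d]$, let $f_I$ be a function on $(\mathbb{F}_p^n)^d$ that depends only on those $x_i$ with $i\in I$, and
suppose that $\|f_I\|_\infty$ is at most 1. Then

$$\Bigl|\mathbb{E}_{x\in(\mathbb{F}_p^n)^d}\,\omega^{\kappa(x)}\prod_{I\subsetneq[d]}f_I(x)\Bigr|\le p^{-r/2^{d-1}}.$$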

Proof. The proof is a standard application of the Cauchy-Schwarz inequality. For $x\in(\mathbb{F}_p^n)^d$
and any proper subset $I\subset[d]$, denote the |I|-tuple $(x_i)_{i\in I}$ by $x_I$. Note that the functions
$f_J$ below take variables indexed by J only and are allowed to change from line to line.

We shall proceed by induction on d. The case d = 1 follows from the definition of rank
(and the case d = 2 was proved in [GW09b]), so let us assume that the result is true for
d − 1. Fix an index $i\in[d]$. Without loss of generality we may assume that this index is d.
By the triangle and the Cauchy-Schwarz inequality we have the bound

$$\Bigl|\mathbb{E}_{x\in(\mathbb{F}_p^n)^d}\,\omega^{\kappa(x)}\prod_{I\subsetneq[d]}f_I(x_I)\Bigr|^2\le\mathbb{E}_{x_{[d-1]}\in(\mathbb{F}_p^n)^{d-1}}\Bigl|\mathbb{E}_{x_d}\,\omega^{\kappa(x)}\prod_{J\subsetneq[d-1]}f_{J\cup\{d\}}(x_J,x_d)\Bigr|^2,$$

and expanding out the inner square yields

$$\mathbb{E}_{x_d,x_d'}\,\mathbb{E}_{x_{[d-1]}}\,\omega^{\kappa(x_{[d-1]},\,x_d-x_d')}\prod_{J\subsetneq[d-1]}f_{J\cup\{d\}}(x_J,x_d)\overline{f_{J\cup\{d\}}(x_J,x_d')}.$$

Let us now write $g_{J,x_d,x_d'}(x)=f_{J\cup\{d\}}(x_J,x_d)\overline{f_{J\cup\{d\}}(x_J,x_d')}$. What we have shown is that

$$\Bigl|\mathbb{E}_{x\in(\mathbb{F}_p^n)^d}\,\omega^{\kappa(x)}\prod_{I\subsetneq[d]}f_I(x_I)\Bigr|^2\le\mathbb{E}_{x_d,x_d'\in\mathbb{F}_p^n}\Bigl|\mathbb{E}_{x_{[d-1]}\in(\mathbb{F}_p^n)^{d-1}}\,\omega^{\kappa(x_{[d-1]},\,x_d-x_d')}\prod_{J\subsetneq[d-1]}g_{J,x_d,x_d'}(x)\Bigr|.$$

Now for each $x_d$, $x_d'$ the function $x_{[d-1]}\mapsto\kappa(x_{[d-1]},x_d-x_d')$ is a (d − 1)-linear form of rank
$r(x_d-x_d')$. By the inductive hypothesis, the inner expectation has modulus bounded above
by $p^{-r(x_d-x_d')/2^{d-2}}$. Therefore, the right-hand side is bounded above by

$$\mathbb{E}_{x_d,x_d'}\,p^{-r(x_d-x_d')/2^{d-2}}\le\Bigl(\mathbb{E}_{x_d,x_d'}\,p^{-r(x_d-x_d')}\Bigr)^{1/2^{d-2}}=p^{-r/2^{d-2}},$$

where for the last equality we used Lemma 5.3. The result follows on taking square
roots. □



We now turn to a simultaneous generalization of Lemma 5.4 and Lemma 3.2. Lemma
3.2 is about the behaviour of polynomials $\pi:\mathbb{F}_p^n\to\mathbb{F}_p$ of degree d, while Lemma 5.4 is
about d-linear functions $\kappa:(\mathbb{F}_p^n)^d\to\mathbb{F}_p$. If we just consider homogeneous polynomials,
then these are at opposite ends of a spectrum of monomials of degree d: the polynomials
$\pi$ involve the smallest possible number of variables and the d-linear functions involve the
largest possible number (1 and d, respectively). Now we want to look at the cases in
between. For example, if $\kappa$ is trilinear, then we will want to look at functions of the form
$\kappa(x,x,x)$, which is a general homogeneous cubic, $\kappa(x,y,z)$, which is trilinear, and also the
intermediate $\kappa(x,x,y)$, which depends quadratically on x and linearly on y.

It will help to have a way of representing a general polynomial in d variables that range
over $\mathbb{F}_p^n$. Let us start with monomials. If our d variables are $x_1,\dots,x_d$, then a monomial of
degree s is obtained by taking an s-tuple $(i_1,\dots,i_s)\in[d]^s$ with $i_1\le\dots\le i_s$ and defining a
function $\mu(x_1,\dots,x_d)=\kappa(x_{i_1},x_{i_2},\dots,x_{i_s})$, where $\kappa=\kappa(u_1,\dots,u_s)$ is some s-linear form.
Moreover, provided that p > s we shall assume that if $i_r=\dots=i_t$ then $\kappa$ is symmetric in
the variables $u_r,\dots,u_t$, since we can just average over all permutations of $u_r,\dots,u_t$. This
means $\kappa$ is the unique s-linear form giving rise to the monomial $\mu$. We define the rank of
$\mu$ to be the rank of $\kappa$. A polynomial of degree s in the variables $x_1,\dots,x_d$ (each of which
takes values in $\mathbb{F}_p^n$) is defined to be a sum of monomials of degree at most s, at least one
of which has degree s.

A sequence $(i_1,\dots,i_s)\in[d]^s$ with $i_1\le\dots\le i_s$ can be thought of as a multisubset of
[d] of size s. If this multiset is V, then we shall write $\overline{V}$ for the underlying set $\{i_1,\dots,i_s\}$
(which in general will have cardinality less than s since not all of $i_1,\dots,i_s$ will be distinct).
If $\mu(x_1,\dots,x_d)=\kappa(x_{i_1},x_{i_2},\dots,x_{i_s})$, then we shall say that $V=(i_1,\dots,i_s)$ is the index of
$\mu$. The multiplicity of an element $j\in\overline{V}$, which we shall also refer to as an element of V,
will be defined to be the number of h such that $i_h=j$, and we shall write |V| for the size
of V (which we have defined to be s, which is the sum of the multiplicities of the elements
of V). If $x=(x_1,\dots,x_d)$ is a d-tuple of elements of $\mathbb{F}_p^n$, then we shall write $x_V$ for the
|V|-tuple $(x_{i_1},\dots,x_{i_s})$.

If f is any function from $(\mathbb{F}_p^n)^d$ to $\mathbb{F}_p$, $i\in[d]$, and $y\in\mathbb{F}_p^n$, we write $ye_i$ for the element
of $(\mathbb{F}_p^n)^d$ which is y in the ith place and zero everywhere else, and we write $\partial_{y,i}f$ for the
function $x\mapsto f(x)-f(x-ye_i)$. Finally, if V is a multisubset of [d] and $i\in[d]$, then we
write $V\setminus\{i\}$ for the multisubset W that is the same as V except that if i has non-zero
multiplicity a in V then it has multiplicity a − 1 in W. For example, if V = (1, 2, 2, 4), then
$V\setminus\{2\}$ = (1, 2, 4) and $V\setminus\{3\}$ = (1, 2, 2, 4). We shall also write $U\subset V$ if the multiplicity
of every element of U is at most its multiplicity in V. So for example, the multisubsets of
(1,2,2) are (), (1), (2), (1,2), (2,2) and (1,2,2).
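
These multiset conventions can be made concrete with a short Python sketch. It is an illustration only: representing a multisubset of [d] as a sorted tuple, and the function names below, are choices made for this example.

```python
from collections import Counter
from itertools import product

def remove_one(V, i):
    """V minus {i}: lower the multiplicity of i by one if i occurs in V."""
    W = Counter(V)
    if W[i] > 0:
        W[i] -= 1
    return tuple(sorted(W.elements()))

def is_submultiset(U, V):
    """U is contained in V: every multiplicity in U is at most that in V."""
    cu, cv = Counter(U), Counter(V)
    return all(cu[x] <= cv[x] for x in cu)

def multisubsets(V):
    """All multisubsets of V, ordered by size and then lexicographically."""
    cv = Counter(V)
    keys = sorted(cv)
    found = []
    for mults in product(*(range(cv[k] + 1) for k in keys)):
        found.append(tuple(k for k, m in zip(keys, mults) for _ in range(m)))
    return sorted(found, key=lambda t: (len(t), t))
```

With these definitions, `remove_one((1, 2, 2, 4), 2)` and `remove_one((1, 2, 2, 4), 3)` reproduce the two examples in the text, and `multisubsets((1, 2, 2))` lists the six multisubsets in the order given there.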

Lemma 5.5. Let d and s be positive integers, let $V=(i_1,\dots,i_s)$ be a multisubset of [d]
of size s, and let $\kappa_V$ be a |V|-linear function of rank r. For each $x=(x_1,\dots,x_d)\in(\mathbb{F}_p^n)^d$
define $\mu_V(x)$ to be the monomial $\kappa_V(x_{i_1},\dots,x_{i_s})$. Then for any fixed y the function $\partial_{y,i}\mu_V$
is a polynomial made up of monomials $\nu_W$ of index W with $W\subset V\setminus\{i\}$. Moreover, if
r(y) is the rank of the monomial $\nu_{V\setminus\{i\}}$ in this polynomial, then $\mathbb{E}_y\,p^{-r(y)}=p^{-r}$.

Proof. This we can prove by a direct calculation. For ease of notation, we shall prove it
just in the case i = d but of course the same argument works for general i. If $d\notin V$,
then $\mu_V$ does not depend on $x_d$ and $\partial_{y,d}\mu_V=0$, so the result is trivial. Otherwise, let us
suppose that d belongs to V with multiplicity t. Then we have

$$\partial_{y,d}\mu_V(x_1,\dots,x_d)=\mu_V(x_1,\dots,x_d)-\mu_V(x_1,\dots,x_d-y)$$
$$=\kappa_V(x_{i_1},\dots,x_{i_{s-t}},x_d,\dots,x_d)-\kappa_V(x_{i_1},\dots,x_{i_{s-t}},x_d-y,\dots,x_d-y).$$

If we expand out this last expression using multilinearity, then we have a linear combination
of terms of the form $\kappa_V(x_{i_1},\dots,x_{i_{s-t}},u_1,\dots,u_t)$, where each $u_j$ is equal to either $x_d$ or y.
The term where every $u_j$ is equal to $x_d$ has a coefficient of zero, and the other terms are
the values of monomials of index W with $W\subset V\setminus\{d\}$. This proves the first part of the
lemma.

We now turn to the assertion about ranks. Since $\kappa_V$ is symmetric, we have the formula

$$\nu_{V\setminus\{d\}}(x_1,\dots,x_d)=t\,\kappa_V(x_{i_1},\dots,x_{i_{s-t}},x_d,\dots,x_d,y),$$

where $x_d$ is repeated t − 1 times. The right-hand side is equal to the value of the (s − 1)-linear
form $\lambda_y:(u_1,\dots,u_{s-1})\mapsto t\,\kappa_V(u_1,\dots,u_{s-1},y)$ at the point $(x_{i_1},\dots,x_{i_{s-t}},x_d,\dots,x_d)$. Now
the forms $\lambda_y$ are non-zero multiples of the restrictions of $\kappa_V$ that are obtained by setting
the final variable equal to y. Therefore, the assertion we wish to prove follows straight
from Lemma 5.3 and the fact that multiplying by a non-zero scalar does not change the
rank of a multilinear form. □

Lemma 5.6. Let $\pi=\pi(x_1,\dots,x_d)$ be a polynomial in d variables and suppose that $\pi=
\sum_{V\in\mathcal{V}}\mu_V$, where $\mathcal{V}$ is a collection of multisubsets of [d] and each $\mu_V$ is a monomial of index
V. Let U be a maximal element of $\mathcal{V}$ (meaning that if $V\in\mathcal{V}$ and $U\subset V$ then U = V), let
s = |U| and let r be the rank of $\mu_U$. Then $|\mathbb{E}_{x\in(\mathbb{F}_p^n)^d}\,\omega^{\pi(x)}|\le p^{-r/2^{s-1}}$.

Proof. If $|\overline{U}|=s$, then the function $\mu_U$ is s-linear. (Recall that $\overline{U}$ is the underlying set of
the multiset U.) In this case the result follows easily from Lemma 5.4. Indeed, without
loss of generality $\mu_U$ depends on $x_1,\dots,x_s$. Then if we fix $x_{s+1},\dots,x_d$, we find that

$$\mathbb{E}_{x_1,\dots,x_s}\,\omega^{\pi(x)}=\mathbb{E}_{x_1,\dots,x_s}\,\omega^{\mu_U(x_1,\dots,x_s)}\prod_{I\subset[s]}f_I(x_1,\dots,x_d),$$

where

$$f_I(x_1,\dots,x_d)=\prod_{W\subset[d],\,W\cup I\in\mathcal{V}\setminus\{U\}}\omega^{\mu_{W\cup I}(x)}.$$

Since $x_{s+1},\dots,x_d$ are fixed, $f_I$ depends just on the variables $x_i$ with $i\in I$. Moreover,
since U is maximal, $f_I=1$ if I = [s], since then there is no W with $W\cup I\in\mathcal{V}$ and
$W\cup I\ne U$. Therefore, Lemma 5.4 gives us an upper bound of $p^{-r/2^{s-1}}$. If we average over
all possibilities for $x_{s+1},\dots,x_d$, then the result follows.

Now let us suppose that at least one element of U has multiplicity greater than 1.
Without loss of generality that element is d, so $\mu_U$ has a nonlinear polynomial dependence
on $x_d$. We shall apply the Cauchy-Schwarz inequality in the usual way:

$$\Bigl|\mathbb{E}_{x\in(\mathbb{F}_p^n)^d}\,\omega^{\pi(x)}\Bigr|^2\le\mathbb{E}_{x_{[d-1]}}\bigl|\mathbb{E}_{x_d}\,\omega^{\pi(x)}\bigr|^2=\mathbb{E}_{x_{[d-1]}}\mathbb{E}_{x_d,x_d'}\,\omega^{\pi(x)-\pi(x')},$$

where we have written x' for the d-tuple $(x_1,\dots,x_{d-1},x_d')$.

Now if we set $y_d=x_d-x_d'$, then $\mu_V(x)-\mu_V(x')=\partial_{y_d,d}\mu_V(x)$. Therefore, we can rewrite
the last expression above as $\mathbb{E}_{y_d}\mathbb{E}_x\,\omega^{\sum_{V\in\mathcal{V}}\partial_{y_d,d}\mu_V(x)}$. By Lemma 5.5, for each fixed y, each function
$\partial_{y,d}\mu_V$ is a linear combination of monomials of index $V\setminus\{d\}$ if $d\in V$ and is 0 otherwise.

Since d has multiplicity greater than 1, it follows that $U\setminus\{d\}$ is a maximal element
of the multiset system $\{V\setminus\{d\}:V\in\mathcal{V}\}$. Indeed, if $U\setminus\{d\}\subset V\setminus\{d\}$, then d must
belong to V, from which it follows that $U\subset V$ and therefore that U = V, and finally that
$U\setminus\{d\}=V\setminus\{d\}$. Therefore, by induction on s, we find that $|\mathbb{E}_x\,\omega^{\sum_{V\in\mathcal{V}}\partial_{y_d,d}\mu_V(x)}|\le p^{-r(y_d)/2^{s-2}}$,
where $r(y_d)$ is as defined in Lemma 5.5. It follows that

$$\Bigl|\mathbb{E}_{y_d}\mathbb{E}_x\,\omega^{\sum_{V\in\mathcal{V}}\partial_{y_d,d}\mu_V(x)}\Bigr|\le\mathbb{E}_{y_d}\Bigl|\mathbb{E}_x\,\omega^{\sum_{V\in\mathcal{V}}\partial_{y_d,d}\mu_V(x)}\Bigr|\le\mathbb{E}_{y_d}\,p^{-r(y_d)/2^{s-2}}\le p^{-r/2^{s-2}}$$

by Lemma 5.5. Once again, the result follows on taking square roots. □



While the analytic definition we have chosen for the rank of a multilinear form is very 
convenient when it comes to evaluating exponential sums, it also has its disadvantages 
(as we also discovered in [GW09c]). In particular, while it follows almost trivially from
the algebraic definition of the rank of a quadratic form that the product of two low-rank
quadratic phases again has low rank, one has to work to prove it from the analytic
definition. However, it is still true, as Lemma 5.9 below shows. The next few statements
are in preparation for that result. 

Lemma 5.7. Let $\mu$ be a d-linear form on $\mathbb{F}_p^n$ and let $f(x_1,\dots,x_d)$ be defined to be $\omega^{\mu(x_1,\dots,x_d)}$.
Then

$$f(a_1,\dots,a_d)=\prod_{\epsilon\in\{0,1\}^d}C^{d-|\epsilon|}f(x_1+\epsilon_1a_1,x_2+\epsilon_2a_2,\dots,x_d+\epsilon_da_d)$$

for every $a_1,\dots,a_d$ and $x_1,\dots,x_d$ in $\mathbb{F}_p^n$.

Proof. We prove this by induction on $d$. For any fixed $u$, the function that takes $(a_1, \dots, a_{d-1})$
to $f(a_1, \dots, a_{d-1}, u)$ is a $(d-1)$-linear phase function. Therefore, if the result is true for
$d - 1$, then for any fixed $u \in \mathbb{F}_p^n$

\[ f(a_1, \dots, a_{d-1}, u) = \prod_{\epsilon \in \{0,1\}^{d-1}} C^{d-1-|\epsilon|} f(x_1 + \epsilon_1 a_1, \dots, x_{d-1} + \epsilon_{d-1} a_{d-1}, u). \]

The multilinearity of $\mu$ also implies that

\[ f(a_1, \dots, a_{d-1}, a_d) = f(a_1, \dots, a_{d-1}, x_d + a_d)\, \overline{f(a_1, \dots, a_{d-1}, x_d)}. \]

Applying the previous formula with $u$ equal to $x_d + a_d$ and $x_d$ and taking the product with
the appropriate complex conjugation gives the result for $d$. The case $d = 1$ is trivial to
verify. □
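As a sanity check on Lemma 5.7, the following sketch tests the identity numerically at the level of exponents. It assumes that $C$ denotes complex conjugation, so that $C^{d-|\epsilon|}$ acting on $\omega^{k}$ multiplies the exponent $k$ by $(-1)^{d-|\epsilon|}$. The helper names (`random_form`, `evaluate`, `alternating_sum`, `check_lemma_5_7`) and the small parameters are illustrative choices of ours and are not taken from the paper.

```python
import itertools
import random


def random_form(d, n, p, rng):
    """Coefficient tensor of a random d-linear form on F_p^n (illustrative helper)."""
    return {idx: rng.randrange(p) for idx in itertools.product(range(n), repeat=d)}


def evaluate(form, vectors, p):
    """Value mu(v_1, ..., v_d) in F_p of the d-linear form at the given vectors."""
    total = 0
    for idx, coeff in form.items():
        term = coeff
        for vec, i in zip(vectors, idx):
            term *= vec[i]
        total += term
    return total % p


def alternating_sum(form, a, x, p):
    """Exponent of prod_eps C^(d-|eps|) f(x + eps*a), where f = omega^mu.

    Conjugating omega^k negates k, so C^(d-|eps|) becomes the sign (-1)^(d-|eps|).
    """
    d = len(a)
    total = 0
    for eps in itertools.product((0, 1), repeat=d):
        shifted = [
            tuple((xc + e * ac) % p for xc, ac in zip(xv, av))
            for xv, av, e in zip(x, a, eps)
        ]
        total += (-1) ** (d - sum(eps)) * evaluate(form, shifted, p)
    return total % p


def check_lemma_5_7(d=3, n=2, p=5, trials=200, seed=0):
    """Return True if the identity holds for every sampled choice of a and x."""
    rng = random.Random(seed)
    form = random_form(d, n, p, rng)
    for _ in range(trials):
        a = [tuple(rng.randrange(p) for _ in range(n)) for _ in range(d)]
        x = [tuple(rng.randrange(p) for _ in range(n)) for _ in range(d)]
        if alternating_sum(form, a, x, p) != evaluate(form, a, p):
            return False
    return True
```

Working with exponents in $\mathbb{F}_p$ rather than with complex numbers keeps the check exact, since no floating-point arithmetic is involved.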

Lemma 5.8. Let $f$ be a function from $(\mathbb{F}_p^n)^d$ to $\mathbb{C}$. Then $\|f\|_{U^d} \ge |\mathbb{E}_{x_1, \dots, x_d} f(x_1, \dots, x_d)|$.

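Proof. Again, we use Cauchy-Schwarz several times. The first time, it tells us that

\begin{align*}
|\mathbb{E}_{x_1, \dots, x_d} f(x_1, \dots, x_d)|^{2^d}
&\le \bigl(\mathbb{E}_{x_1, \dots, x_{d-1}} |\mathbb{E}_{x_d} f(x_1, \dots, x_d)|^2\bigr)^{2^{d-1}} \\
&= \bigl(\mathbb{E}_{x_1, \dots, x_{d-1}} \mathbb{E}_{x_d, y_d} f(x_1, \dots, x_d)\overline{f(x_1, \dots, x_{d-1}, y_d)}\bigr)^{2^{d-1}} \\
&\le \mathbb{E}_{x_d, y_d} \bigl|\mathbb{E}_{x_1, \dots, x_{d-1}} f(x_1, \dots, x_d)\overline{f(x_1, \dots, x_{d-1}, y_d)}\bigr|^{2^{d-1}}.
\end{align*}

For each $x_d$ and $y_d$, let $h_{x_d, y_d}(x_1, \dots, x_{d-1}) = f(x_1, \dots, x_d)\overline{f(x_1, \dots, x_{d-1}, y_d)}$. Then, applying
the same argument to each $h_{x_d, y_d}$ (that is, arguing by induction on $d$), the quantity above is
at most $\mathbb{E}_{x_d, y_d} \|h_{x_d, y_d}\|_{U^{d-1}}^{2^{d-1}}$, which also equals $\mathbb{E}_{x_d, a_d} \|h_{x_d, x_d + a_d}\|_{U^{d-1}}^{2^{d-1}}$, which,
by the definition of the $U^{d-1}$ and $U^d$ norms, is equal to $\|f\|_{U^d}^{2^d}$. □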
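The inequality of Lemma 5.8 can be checked numerically in the smallest interesting case. The sketch below assumes $d = 2$ and $n = 1$, so that $f$ is a function on $\mathbb{Z}_p \times \mathbb{Z}_p$, and computes $\|f\|_{U^2}^4$ directly from the definition used in the proof above. The function names and the choice $p = 3$ are ours and are purely illustrative.

```python
import itertools
import random


def box_norm_power(f, p):
    """||f||_{U^2}^4 for f on Z_p x Z_p, computed from the definition (d = 2)."""
    total = 0j
    for x1, x2, a1, a2 in itertools.product(range(p), repeat=4):
        y1, y2 = (x1 + a1) % p, (x2 + a2) % p
        # eps = (1,1) and (0,0) are unconjugated; eps = (1,0) and (0,1) are conjugated.
        total += (
            f[(y1, y2)]
            * f[(y1, x2)].conjugate()
            * f[(x1, y2)].conjugate()
            * f[(x1, x2)]
        )
    return (total / p**4).real


def mean(f, p):
    """E_{x1,x2} f(x1, x2)."""
    return sum(f.values()) / p**2


def random_function(p, rng):
    """A random complex-valued function on Z_p x Z_p (illustrative test input)."""
    return {
        (x1, x2): complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        for x1 in range(p)
        for x2 in range(p)
    }
```

For any such `f`, the lemma predicts `box_norm_power(f, p) >= abs(mean(f, p)) ** 4`, up to floating-point rounding.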



If $\mu$ is a multilinear form, let us define $\alpha(\mu)$ to be $p^{-r(\mu)} = \mathbb{E}_{x_1, \dots, x_d}\, \omega^{\mu(x_1, \dots, x_d)}$.

Lemma 5.9. Let $\mu$ and $\nu$ be $d$-linear forms on $\mathbb{F}_p^n$. Then

\[ r(\mu + \nu) \le 2^d (r(\mu) + r(\nu)). \]

Proof. Let $f(x_1, \dots, x_d) = \omega^{\mu(x_1, \dots, x_d)}$ and let $g(x_1, \dots, x_d) = \omega^{\nu(x_1, \dots, x_d)}$. Then

\[ \alpha(\mu + \nu) = \mathbb{E}_{a_1, \dots, a_d} f(a_1, \dots, a_d)\, g(a_1, \dots, a_d), \]

which, by Lemma 5.7, is equal to the expectation over $a_1, \dots, a_d$, $x_1, \dots, x_d$ and $y_1, \dots, y_d$
of

\[ \prod_{\epsilon \in \{0,1\}^d} C^{d-|\epsilon|} f(x_1 + \epsilon_1 a_1, \dots, x_d + \epsilon_d a_d) \prod_{\eta \in \{0,1\}^d} C^{d-|\eta|} g(y_1 + \eta_1 a_1, \dots, y_d + \eta_d a_d). \]




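For each $u_1, \dots, u_d$, define $h_{u_1, \dots, u_d}(x_1, \dots, x_d)$ to be $f(x_1, \dots, x_d)\, g(x_1 + u_1, \dots, x_d + u_d)$.
Then we can rewrite this expectation as

\[ \mathbb{E}_{a_1, \dots, a_d} \mathbb{E}_{x_1, \dots, x_d} \mathbb{E}_{u_1, \dots, u_d} \prod_{\epsilon \in \{0,1\}^d} C^{d-|\epsilon|} h_{u_1, \dots, u_d}(x_1 + \epsilon_1 a_1, x_2 + \epsilon_2 a_2, \dots, x_d + \epsilon_d a_d), \]

which is equal to $\mathbb{E}_{u_1, \dots, u_d} \|h_{u_1, \dots, u_d}\|_{U^d}^{2^d}$.

This is at least $(\mathbb{E}_{u_1, \dots, u_d} \|h_{u_1, \dots, u_d}\|_{U^d})^{2^d}$ by Hölder's inequality (or $d$ applications of
Cauchy-Schwarz), and by Lemma 5.8 that is at least

\begin{align*}
\bigl(\mathbb{E}_{u_1, \dots, u_d} |\mathbb{E}_{x_1, \dots, x_d} h_{u_1, \dots, u_d}(x_1, \dots, x_d)|\bigr)^{2^d}
&\ge |\mathbb{E}_{u_1, \dots, u_d} \mathbb{E}_{x_1, \dots, x_d} h_{u_1, \dots, u_d}(x_1, \dots, x_d)|^{2^d} \\
&= |\mathbb{E}_{x_1, \dots, x_d} \mathbb{E}_{y_1, \dots, y_d} f(x_1, \dots, x_d)\, g(y_1, \dots, y_d)|^{2^d}.
\end{align*}

We have shown that $\alpha(\mu + \nu) \ge (\alpha(\mu)\alpha(\nu))^{2^d}$, and the result follows on taking logs. □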

Corollary 5.10. Let $\kappa_1, \dots, \kappa_m$ be $d$-linear forms on $\mathbb{F}_p^n$. Then

\[ r(\kappa_1 + \dots + \kappa_m) \le (2m)^d (r(\kappa_1) + \dots + r(\kappa_m)). \]

Proof. We begin with the case where $m = 2^h$. We claim that in this case we have a stronger
estimate in which the factor on the right-hand side is $m^d$ rather than $(2m)^d$. We prove this
by induction, noting that when $h = 1$ the statement is given to us by Lemma 5.9.
Suppose that we have proved it for all powers of 2 up to $2^{h-1}$. Then by Lemma 5.9,

\[ r(\kappa_1 + \dots + \kappa_m) \le 2^d \bigl(r(\kappa_1 + \dots + \kappa_{m/2}) + r(\kappa_{m/2+1} + \dots + \kappa_m)\bigr), \]

and by the inductive hypothesis applied to the two terms this is at most

\[ 2^d (m/2)^d (r(\kappa_1) + \dots + r(\kappa_m)) = m^d (r(\kappa_1) + \dots + r(\kappa_m)), \]

which completes the inductive step.

In general, since the $d$-linear form that takes the value zero everywhere has rank zero,
if $m$ is not a power of 2, then we can add enough copies of the zero map to make it up to
the next power of 2. This does not increase $m$ by more than a factor of 2. The result is
proved. □
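The padding step at the end of this proof amounts to a small piece of arithmetic: if $m'$ is the least power of 2 with $m' \ge m$, then $m' < 2m$, so the factor $(m')^d$ obtained for $m'$ forms is at most $(2m)^d$. The following sketch makes this explicit; the function names are ours.

```python
def padded_count(m):
    """Smallest power of two that is at least m (number of forms after padding)."""
    padded = 1
    while padded < m:
        padded *= 2
    return padded


def rank_factor(m, d):
    """Factor multiplying r(k_1) + ... + r(k_m) once zero forms have been added.

    For m = 1 this returns 1, which corresponds to the trivial bound r(k_1) <= r(k_1).
    """
    return padded_count(m) ** d
```

For example, five forms are padded to eight, and `rank_factor(5, d)` is $8^d$, which is at most $(2 \cdot 5)^d$.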

Equipped with this knowledge about the rank of a sum of multilinear forms of degree d, 
we now prove that for any set of multilinear forms of high rank at least one "independent" 
linear combination of these multilinear forms must have fairly high rank.

Lemma 5.11. Let $\kappa_1, \dots, \kappa_m$ be multilinear forms of degree $d$, at least one of which has
rank at least $R$. Let $B$ be an invertible $m \times m$ matrix with entries $b_{ij} \in \mathbb{F}_p$. Then at least
one of the multilinear forms $\eta_j = \sum_{i=1}^m b_{ij} \kappa_i$ has rank at least $R/(2m)^d$.

Proof. Let $\kappa_i$ have rank at least $R$. It follows from the assumption that $B$ is invertible that
$\kappa_i$ is a linear combination of the forms $\eta_j$. Write $r_j$ for the rank of $\eta_j$, and let $r = \max_j r_j$.
Then the rank of any linear combination of the $\eta_j$ is at most $(2m)^d r$, by Corollary 5.10.
The result follows. □

Up to now we have made no mention of the linear forms $L_1, \dots, L_m$, which are of course
central in this result, their crucial property being degree-$s$ independence. We shall now
draw together the results proved in this section to obtain an estimate for exponential sums
of a certain kind that involve degree-$s$ independent systems. Recall that our eventual aim
is to obtain upper bounds for expressions of the form $|\mathbb{E}_{x_1, \dots, x_d} \prod_{i=1}^m f(L_i(x_1, \dots, x_d))|$. We
shall do this by using Theorem 4.3 to decompose $f$ into a linear combination of high-rank
polynomial phase functions plus some error terms. We shall show that the error
terms can be ignored, so we will be left with a linear combination of terms of the form
$\mathbb{E}_{x_1, \dots, x_d} \prod_{i=1}^m f_i(L_i(x_1, \dots, x_d))$ to estimate, where each $f_i$ is a high-rank polynomial phase
function. The next proposition gives us an upper bound for the size of such a term.

Proposition 5.12. Let $m$, $s$ and $k$ be positive integers with $s \le k$, and for each $i =
1, \dots, m$ let $\pi_i : \mathbb{F}_p^n \to \mathbb{F}_p$ be a polynomial of degree between $s$ and $k$ and of rank at
least $R$. Let $L_1, \dots, L_m$ be linear forms in $d$ variables and suppose that they are degree-$s$
independent. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d}\, \omega^{\sum_{i=1}^m \pi_i(L_i(x))} \Bigr| \le p^{-R/(2^k (2m)^d)}. \]

Proof. Let $t$ be the maximal degree of the polynomials $\pi_i$, so that $s \le t \le k$. Let us treat
all of the $\pi_i$ as though they had degree $t$, with some (but not all) of them possibly having
leading coefficient zero. The main difference this makes is that if $\pi_i$ is a polynomial that in
fact has degree less than $t$ then we shall say that its rank is 0 (as a degree-$t$ polynomial),
because the $t$-linear form associated with it will be the zero form.

For each $i$, let $L_i$ be the linear form $L_i(x_1, \dots, x_d) = \sum_{u=1}^d c_{iu} x_u$ and let us write $\pi_i$ as
$\pi_i(x) = \sum_{j=0}^t \kappa_{ij}(x, x, \dots, x)$, where $\kappa_{ij}$ is a symmetric $j$-linear form. Then

\[ \pi_i(L_i(x_1, \dots, x_d)) = \pi_i\Bigl(\sum_u c_{iu} x_u\Bigr) = \sum_{j=0}^t \kappa_{ij}\Bigl(\sum_u c_{iu} x_u, \dots, \sum_u c_{iu} x_u\Bigr), \]

where the sums over $u$ are from 1 to $d$. Expanding out this expression, we get

\[ \sum_{j=0}^t \sum_{u_1, \dots, u_j} c_{iu_1} \cdots c_{iu_j}\, \kappa_{ij}(x_{u_1}, \dots, x_{u_j}). \]

We shall be interested in the degree-$t$ part of this, so let us write it as

\[ \sum_{u_1, \dots, u_t} c_{iu_1} \cdots c_{iu_t}\, \kappa_{it}(x_{u_1}, \dots, x_{u_t}) + \rho(x_1, \dots, x_d), \]

where $\rho$ is a polynomial in $x_1, \dots, x_d$ of degree less than $t$. Note that some of the $\kappa_{it}$ may
be zero, but at least one $\kappa_{it}$ has rank at least $R$.

Given a multisubset $V$ of $[d]$ of size $t$, let $\sigma(V)$ be the set of all $U \in [d]^t$ that give rise
to $V$ if their terms are written in increasing order. For example, if $V = (2, 3, 3)$ then the
elements of $\sigma(V)$ are $(2,3,3)$, $(3,2,3)$ and $(3,3,2)$. If $U = (u_1, \dots, u_t)$, then let us write
$c_{iU}$ for $c_{iu_1} \cdots c_{iu_t}$ and $x_U$ for $(x_{u_1}, \dots, x_{u_t})$. If $U$ and $U'$ belong to the same set $\sigma(V)$, then
$c_{iU} = c_{iU'}$, and also, since the forms $\kappa_{it}$ are symmetric, $\kappa_{it}(x_U) = \kappa_{it}(x_{U'})$. Therefore, we
can regard $c_{iU}$ and $\kappa_{it}(x_U)$ as functions of $V$ rather than of $U$ if we wish. Writing $\mathcal{V}_t$ for
the set of all multisubsets of $[d]$ of size $t$, we also have

\[ \sum_{u_1, \dots, u_t} c_{iu_1} \cdots c_{iu_t}\, \kappa_{it}(x_{u_1}, \dots, x_{u_t}) = \sum_{V \in \mathcal{V}_t} \sum_{U \in \sigma(V)} c_{iU}\, \kappa_{it}(x_U). \]

If we now sum over $i$ we find that the degree-$t$ part of the polynomial function $(x_1, \dots, x_d) \mapsto
\sum_{i=1}^m \pi_i(L_i(x_1, \dots, x_d))$ is

\[ \sum_{i=1}^m \sum_{V \in \mathcal{V}_t} \sum_{U \in \sigma(V)} c_{iU}\, \kappa_{it}(x_U) = \sum_{V \in \mathcal{V}_t} \sum_{i=1}^m c'_{iV}\, \kappa_{it}(x_V), \]

where $c'_{iV} = |\sigma(V)|\, c_{iU}$ for any $U \in \sigma(V)$.
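The regrouping of the sum over $U$ according to the underlying multiset $V$ can be illustrated concretely. The sketch below enumerates $\sigma(V)$, computes $|\sigma(V)|$ as a multinomial coefficient, and groups the products $c_{u_1} \cdots c_{u_t}$ by multiset. It uses integer coefficients and 0-based indices for convenience (the text indexes $[d]$ from 1 and works in $\mathbb{F}_p$); all names are illustrative.

```python
import itertools
from collections import Counter
from math import factorial


def sigma(V):
    """All orderings U of the multiset V, i.e. the elements of sigma(V)."""
    return sorted(set(itertools.permutations(V)))


def sigma_size(V):
    """|sigma(V)| as the multinomial coefficient t! / (m_1! m_2! ...)."""
    size = factorial(len(V))
    for multiplicity in Counter(V).values():
        size //= factorial(multiplicity)
    return size


def grouped_coefficients(c, t):
    """Sum of c[u_1] * ... * c[u_t] over all U in sigma(V), indexed by sorted V."""
    coeffs = Counter()
    for U in itertools.product(range(len(c)), repeat=t):
        product = 1
        for u in U:
            product *= c[u]
        coeffs[tuple(sorted(U))] += product
    return coeffs
```

With `c = [2, 3, 5]` and `t = 3`, the entry for the multiset `(1, 2, 2)` is `sigma_size((1, 2, 2)) * 3 * 5 * 5`, in line with the formula for $c'_{iV}$.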

We would now like to apply Lemma 5.6. For this purpose, we need at least one of the
multilinear functions $\sum_{i=1}^m c'_{iV} \kappa_{it}$ to have high rank.

Claim. At least one of the multilinear functions $\sum_{i=1}^m c'_{iV} \kappa_{it}$ has rank at least $R/(2m)^d$.

Proof. It is here that we use the linear independence of $L_1^t, \dots, L_m^t$ (implicitly exploiting
Lemma 5.2). By this we mean that the $L_i^t$ are linearly independent when regarded as
functions from $\mathbb{F}_p^d$ to $\mathbb{F}_p$. That is, if $z = (z_1, \dots, z_d) \in \mathbb{F}_p^d$, then we consider the function
$z \mapsto (\sum_{u=1}^d c_{iu} z_u)^t$. This is a polynomial of degree $t$ in $d$ variables, and if $V = (v_1, \dots, v_t)$
is a multiset of size $t$, then the coefficient of $z_{v_1} \cdots z_{v_t}$ is precisely $c'_{iV}$. It follows that if we
define a matrix $(c'_{iV})$, where $i$ ranges from 1 to $m$ and $V$ ranges over $\mathcal{V}_t$, then its $m$ rows
are linearly independent (since they give us the coefficients of the polynomials $L_1^t, \dots, L_m^t$).

Since row-rank equals column-rank, we can find $m$ multisets $V_1, \dots, V_m$ in $\mathcal{V}_t$ such that
the columns $(c'_{iV_j})_{i=1}^m$ are linearly independent. By Lemma 5.11 it follows that there exists
$j$ such that the rank of the multilinear map $\sum_{i=1}^m c'_{iV_j} \kappa_{it}$ is at least $R/(2m)^d$, just as we
wanted. This completes the proof of the claim. □

Since $V_j$ has maximal size, it is in particular maximal. Therefore, the result follows from
Lemma 5.6. □



6. Proof of our main conjecture in $\mathbb{F}_p^n$



Our aim in this paper was to establish Conjecture 1.2 for all linear systems over $\mathbb{F}_p^n$
(provided that $p$ is not too small). In other words, we set out to prove the following result.



Theorem 6.1. Let $L_1, \dots, L_m$ be a system of $m$ linear forms in $d$ variables in $\mathbb{F}_p^n$ of
Cauchy-Schwarz complexity $k < p$. Suppose that $L_1, \dots, L_m$ are degree-$s$ independent for
some $s < p$. Then for every $\epsilon > 0$ there exists $c > 0$ with the following property.
If $f : \mathbb{F}_p^n \to [-1, 1]$ is such that $\|f\|_{U^s} \le c$, then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)) \Bigr| \le \epsilon. \]

In other words, $L_1, \dots, L_m$ has true complexity at most $s - 1$.



As we commented at the beginning of the paper, Green and Tao proved that the true
complexity of a system $L_1, \dots, L_m$ of Cauchy-Schwarz complexity $k$ is at most $k$, and we observed
in [GW09a] that if $L_1^s, \dots, L_m^s$ are linearly dependent then the conclusion of Theorem 6.1
is false. Therefore, if we choose the minimal possible $s$ above, then either $s = k + 1$ and
the theorem follows from the result of Green and Tao, or $s \le k$. Thus, the assumption
that $s < p$ is not important once we know that $k < p$.

The next result is essentially what Green and Tao proved in [GrT08b], though the setting
in that paper is rather more complicated because of the application to the primes. In any
case, the proof is a sequence of applications of the Cauchy-Schwarz inequality combined
with a judiciously chosen reparametrization of the linear system.

Theorem 6.2. Let $f_1, \dots, f_m$ be functions defined on $\mathbb{F}_p^n$ and let $L_1, \dots, L_m$ be a linear
system of Cauchy-Schwarz complexity $k < p$ consisting of $m$ forms in $d$ variables. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \Bigr| \le \min_i \|f_i\|_{U^{k+1}} \prod_{j \ne i} \|f_j\|_\infty. \]



Let us now turn to the proof of Theorem 6.1. We begin with a brief description of the
general strategy. We are aiming to prove an upper bound for $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x))$. Our
first step is to decompose the first occurrence of $f$ using Corollary 4.4. This allows us to
write $f = f^{(1)} + g^{(1)} + h^{(1)}$, where $f^{(1)}$ is a linear combination of polynomial phase functions
of degrees between $s$ and $k$ and high rank, $g^{(1)}$ is a function with very small $U^{k+1}$ norm,
and $h^{(1)}$ has small $L_2$ norm. Having done this, we can rewrite the expression we are trying
to estimate as

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \bigl( f^{(1)}(L_1(x)) + g^{(1)}(L_1(x)) + h^{(1)}(L_1(x)) \bigr) \prod_{i=2}^m f(L_i(x)), \]

which splits into three terms that we can estimate separately.

In order to estimate the term involving $h^{(1)}$, we simply use the fact that $\|h^{(1)}\|_1 \le \|h^{(1)}\|_2$,
and that $\prod_{i=2}^m f(L_i(x))$ takes values in $[-1, 1]$. To estimate the term involving $g^{(1)}$ we
use Theorem 6.2 and the upper bound on $\|g^{(1)}\|_{U^{k+1}}$. That leaves us with our original
expression, except that now the first $f$ has been changed into an $f^{(1)}$. This represents a
gain, in that $f^{(1)}$ is a linear combination of polynomial phase functions, which is what we
want if we are to use Proposition 5.12. However, we also lose something, since when we
throw away the low-rank polynomial phases, we no longer know that $f^{(1)}$ takes values in
$[-1, 1]$. However, we do have an upper bound on the sum of the absolute values of the
coefficients of the functions that make up $f^{(1)}$, so we do at least have some upper bound
$M$ for $\|f^{(1)}\|_\infty$. This means that we can play the same game with the second occurrence
of $f$, as long as we replace $\epsilon$ by $\epsilon/M$.

Thus, we shall end up decomposing $f$ in $m$ different ways, each time using Corollary 4.4
but asking for smaller and smaller error terms. When we have done this, we can get rid of
everything except the linear combinations of polynomial phases. Having chosen the right
bounds to make this possible, we then make sure that the ranks of the polynomial phases
are large (by assuming that $f$ has a sufficiently small $U^s$ norm to start with).

In order to make this argument precise, we begin by running it without specifying the
functions that we use to ensure that the ranks are large (which we can do as our high-rank
decomposition result, Corollary 4.4, is true for arbitrary functions). We then work out
what these functions have to be in order for Proposition 5.12 to give small enough bounds
for the contribution from the polynomial phases to be small.

To do this, let $R^{(1)}, \dots, R^{(m)}$ be functions with $R^{(i)} : \mathbb{R}_+^i \times \mathbb{R}_+ \to \mathbb{R}_+$. (Here, $R^{(i)}$
will depend on variables $(M^{(1)}, \dots, M^{(i)}, \epsilon)$. We shall think of it as a function of $M^{(i)}$
that is allowed to depend on the other variables.) Let $\eta = \epsilon$ and apply Corollary 4.4
to write $f$ as $f^{(1)} + g^{(1)} + h^{(1)}$, where $f^{(1)}$ is a linear combination $\sum_j \lambda_j \omega^{\pi_j}$ such that
$\sum_j |\lambda_j| = M^{(1)} \le M_0^{(1)}(R^{(1)}, \epsilon)$, each $\pi_j$ has degree between $s$ and $k$ and rank at least
$R^{(1)}(M^{(1)}, \epsilon)$, $\|g^{(1)}\|_{U^{k+1}} \le \epsilon$, and $\|h^{(1)}\|_2 \le \epsilon$.

Because it is very important, we remark that $R^{(1)}$ is a function of $M^{(1)}$ and $\epsilon$, and $M_0^{(1)}$ is
a function of $\epsilon$ and the function $R^{(1)}$ rather than the value taken by that function. In other
words, if we specify $R^{(1)}$ and $\epsilon$, then we already know what $M_0^{(1)}$ is, quite independently
of $M^{(1)}$. We can then find $M^{(1)}$ that is at most $M_0^{(1)}$. Thus, what looks at first like a
circularity is in fact not circular at all.

Now let us continue. Suppose that we have applied Corollary 4.4 $i - 1$ times. On
the $i$th occasion, we apply Corollary 4.4 again but with $\eta$ and $\epsilon$ replaced by $\epsilon^{(i)} =
\epsilon (M^{(1)} \cdots M^{(i-1)})^{-1}$. This time, the polynomial phases have coefficients with absolute
values that sum to $M^{(i)} \le M_0^{(i)}(R^{(i)}, M^{(1)}, \dots, M^{(i-1)}, \epsilon)$ and have rank at least
$R^{(i)}(M^{(1)}, \dots, M^{(i)}, \epsilon)$. We also have the estimates $\|g^{(i)}\|_{U^{k+1}} \le \epsilon^{(i)}$ and $\|h^{(i)}\|_2 \le \epsilon^{(i)}$.
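The point of the choice $\epsilon^{(i)} = \epsilon (M^{(1)} \cdots M^{(i-1)})^{-1}$ is that $\epsilon^{(i)} M^{(1)} \cdots M^{(i-1)} = \epsilon$ for every $i$, so an error of size $\epsilon^{(i)}$ multiplied by the sup-norm bounds of the earlier pieces is again of size $\epsilon$. The sketch below computes this schedule for given values of the $M^{(i)}$; the numerical values in the usage note are arbitrary and the function names are ours.

```python
def error_schedule(eps, M):
    """Return [eps^(1), ..., eps^(m)], where eps^(i) = eps / (M[0] * ... * M[i-2])."""
    schedule = []
    running_product = 1.0
    for M_i in M:
        schedule.append(eps / running_product)
        running_product *= M_i
    return schedule


def rescaled_errors(eps, M):
    """Return the products eps^(q) * M^(1) * ... * M^(q-1), which should all equal eps."""
    rescaled = []
    running_product = 1.0
    for eps_q, M_q in zip(error_schedule(eps, M), M):
        rescaled.append(eps_q * running_product)
        running_product *= M_q
    return rescaled
```

For instance, `error_schedule(0.01, [3.0, 10.0, 7.5])` gives the values $0.01$, $0.01/3$ and $0.01/30$ up to rounding, and every entry of `rescaled_errors(0.01, [3.0, 10.0, 7.5])` is $0.01$ up to rounding.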

Claim. Let $f$ be decomposed as $f^{(i)} + g^{(i)} + h^{(i)}$ in $m$ ways as just described. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)) - \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f^{(i)}(L_i(x)) \Bigr| \le 2m\epsilon. \]

Proof. For each $q$, let us estimate

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i \le q-1} f^{(i)}(L_i(x)) \prod_{i > q-1} f(L_i(x)) - \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i \le q} f^{(i)}(L_i(x)) \prod_{i > q} f(L_i(x)), \]

which is equal to

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i < q} f^{(i)}(L_i(x)) \bigl( g^{(q)}(L_q(x)) + h^{(q)}(L_q(x)) \bigr) \prod_{i > q} f(L_i(x)). \]

Since $L_q(x)$ is evenly distributed over $\mathbb{F}_p^n$, the contribution from the $h^{(q)}$ term is at most
$\|h^{(q)}\|_1 \prod_{i<q} \|f^{(i)}\|_\infty \le \epsilon^{(q)} \prod_{i<q} M^{(i)} = \epsilon$. As for the contribution from the $g^{(q)}$ term, by
Theorem 6.2 it is at most $\|g^{(q)}\|_{U^{k+1}} \prod_{i<q} \|f^{(i)}\|_\infty \le \epsilon^{(q)} \prod_{i<q} M^{(i)} = \epsilon$.

Since the quantity we are trying to estimate is the sum over $q$ of the quantities we have just
estimated, the claim follows from the triangle inequality. □

It remains to prove that $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f^{(i)}(L_i(x))$ is small. Since we have Proposition
5.12, this is a question of making sure we choose the ranks appropriately. In a sense,
this is a trivial matter, but it takes a small effort to check that the dependence of our
various parameters is such that we really are free to choose the ranks to be as big as we
need for the proposition to give us a good enough bound.

It follows immediately from Proposition 5.12 and the triangle inequality that we will be
done if we can choose the functions $R^{(i)}$ in such a way that $R^{(i)}(M^{(1)}, \dots, M^{(i)}, \epsilon) \ge R =
R(M^{(1)}, \dots, M^{(m)}, \epsilon)$, where $R$ is large enough for $p^{-R/(2^k (2m)^d)}$ to be at most $\epsilon (M^{(1)} \cdots M^{(m)})^{-1}$.
The difficulty we must deal with is that $R^{(i)}$ does not depend on $M^{(i+1)}, \dots, M^{(m)}$, and it
looks as though it needs to.

We deal with this inductively as follows. Suppose that we have chosen the functions
$R^{(i+1)}, \dots, R^{(m)}$ and are now trying to choose $R^{(i)}$. Let us define a sequence
$N^{(i,i+1)}, \dots, N^{(i,m)}$ as follows. We let $N^{(i,i+1)} = M_0^{(i+1)}(R^{(i+1)}, M^{(1)}, \dots, M^{(i)}, \epsilon)$, then
$N^{(i,i+2)} = M_0^{(i+2)}(R^{(i+2)}, M^{(1)}, \dots, M^{(i)}, N^{(i,i+1)}, \epsilon)$, and so on. A trivial induction shows
that the $N^{(i,j)}$ are upper bounds for the $M^{(j)}$ when $j > i$, and they depend just on
$M^{(1)}, \dots, M^{(i)}$, $\epsilon$, and the already chosen functions $R^{(i+1)}, \dots, R^{(m)}$. Therefore, we can
define $R^{(i)}(M^{(1)}, \dots, M^{(i)}, \epsilon)$ to be $R(M^{(1)}, \dots, M^{(i)}, N^{(i,i+1)}, \dots, N^{(i,m)}, \epsilon)$.

The total error incurred in this argument is of course $(2m + 1)\epsilon$, but this is easily rectified
by replacing $\epsilon$ with $\epsilon/(2m + 1)$ throughout.
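The requirement placed on $R$ in the argument above can be made explicit: $p^{-R/(2^k (2m)^d)} \le \epsilon (M^{(1)} \cdots M^{(m)})^{-1}$ holds exactly when $R \ge 2^k (2m)^d \log_p (M^{(1)} \cdots M^{(m)} / \epsilon)$. The following sketch evaluates this threshold, reading the exponent as $R/(2^k (2m)^d)$; the function name and the sample parameters in the usage note are ours.

```python
import math


def required_rank(p, k, m, d, eps, M):
    """Smallest real R with p^(-R / (2^k (2m)^d)) <= eps / (M[0] * ... * M[-1])."""
    target = eps / math.prod(M)
    return (2 ** k) * (2 * m) ** d * math.log(1 / target, p)
```

For example, `required_rank(5, 2, 3, 2, 0.1, [2.0, 4.0, 8.0])` is $144 \log_5 640$, and any larger value of $R$ makes the bound from Proposition 5.12 strictly smaller than $0.1/64$.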

7. The off-diagonal case 

In this section, we briefly discuss a closely related question that can also be treated by
our techniques. Recall that we initially set out to find the minimal $k$ with the following
property: if $A$ is a subset of $\mathbb{F}_p^n$ of density $\delta$ such that $\|A - \delta 1\|_{U^k}$ is sufficiently small,
then the density of $x \in (\mathbb{F}_p^n)^d$ such that $L_i(x) \in A$ for $i = 1, \dots, m$ is approximately $\delta^m$.
Lemma 1.3 allowed us to recast that as a question about functions: if we set $f$ to be
$A - \delta 1$, then we know that $f$ is bounded and $\|f\|_{U^k}$ is small, and we want to be able to
deduce that $\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i \in E} f(L_i(x))$ is small for every non-empty subset $E \subset \{1, 2, \dots, m\}$.
A necessary and sufficient condition on $k$ turned out to be that the linear forms were
degree-$k$ independent.

What happens if we try to estimate the density of $x$ such that $L_i(x) \in A_i$ for $i = 1, \dots, m$,
where the sets $A_1, \dots, A_m$ do not have to be equal? Associated with each set $A_i$ will be
its density $\delta_i$, and in this case we would like to find a necessary and sufficient condition on
the sequence $(k_1, \dots, k_m)$ such that if $\|A_i - \delta_i 1\|_{U^{k_i}}$ is sufficiently small for every $i$, then
the density of $x$ such that $L_i(x) \in A_i$ for $i = 1, \dots, m$ is approximately $\prod_{i=1}^m \delta_i$. We call
this the off-diagonal case of the problem.

We have not completely solved the off-diagonal case, but we do have a sufficient condition 
that generalizes the condition we obtained in the diagonal case in a natural way. The 
statement is as follows. 

Theorem 7.1. For every $\epsilon > 0$ and every sequence $(s_1, \dots, s_m)$ of positive integers there
exists a constant $c > 0$ with the following property. Let $L_1, \dots, L_m$ be linear forms such that
for every $i \le m$ it is impossible to write $L_i^{s_i}$ as a linear combination of the functions $L_j^{s_i}$ with
$j \ne i$, and let $A_1, \dots, A_m$ be subsets of $\mathbb{F}_p^n$ such that $A_i$ has density $\delta_i$ and $\|A_i - \delta_i 1\|_{U^{s_i}} \le c$
for every $i \le m$. Then the density of $x$ such that $L_i(x) \in A_i$ for every $i$ differs from $\prod_{i=1}^m \delta_i$
by at most $\epsilon$.

As in the diagonal case, it is more convenient to work with a version of the result for 
functions that implies the sets version. 

Theorem 7.2. For every $\epsilon > 0$ and every sequence $(s_1, \dots, s_m)$ of positive integers there
exists a constant $c > 0$ with the following property. Let $L_1, \dots, L_m$ be linear forms such
that for every $i \le m$ it is impossible to write $L_i^{s_i}$ as a linear combination of the functions
$L_j^{s_i}$ with $j \ne i$, and let $f_1, \dots, f_m$ be functions from $\mathbb{F}_p^n$ to $\mathbb{C}$ such that $\|f_i\|_\infty \le 1$ and
$\|f_i\|_{U^{s_i}} \le c$ for every $i \le m$. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \Bigr| \le \epsilon. \]



The fact that a result like this ought to be true was observed independently by Hamed 
Hatami and Shachar Lovett, who were able to prove it in F2 using the methods from 
[GW09a] in the cases that only required the inverse theorems for the $U^2$ and $U^3$ norms. In
general, it is not too difficult to adapt the methods in the present paper to give the result 
in full generality. 

To prove Theorem 7.2 we need some slight strengthenings of some of the lemmas from
§5. We begin with a lemma about matrices that we shall use instead of the statement that
the row rank of a matrix is equal to its column rank.



Lemma 7.3. Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$ and suppose that it is not possible
to express the $i$th row of $A$ as a linear combination of the other rows. Then the column
space of $A$ contains the column vector with a 1 in the $i$th row and zeros everywhere else.

Proof. Without loss of generality $i = 1$. We shall attempt to use column operations to
produce a matrix that has the desired column vector as its first column. In other words,
we would like a 1 in the top left-hand corner and for all the other rows to begin with 0.

Since the first row cannot be all zero, we can do Gaussian column operations to make
it 1 in the first place and 0 everywhere else. Note that even after doing these column
operations it is still the case that the first row is not a linear combination of the remaining
rows. Now let $B$ be the matrix obtained by deleting the first row. We will be done if we
can prove that the first column of $B$ is a linear combination of the other columns.

If it is not a linear combination of the other columns, then there must be a linear
functional that vanishes on all the columns except for the first. Equivalently, there must
be a linear combination of the rows of $B$ that vanishes everywhere except in the first
coordinate. But from that it follows that the first row of the modified matrix $A$ is a linear
combination of the rows of $B$, which contradicts our assumption. □
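Lemma 7.3 is easy to test on small matrices over $\mathbb{F}_p$. The sketch below is an illustration rather than part of the proof: it computes ranks by Gaussian elimination modulo a prime $p$, detects whether row $i$ is independent of the other rows by comparing ranks, and detects whether the unit vector $e_i$ lies in the column space by checking that appending it as a column does not raise the rank. All function names are ours, and rows are indexed from 0.

```python
def rank_mod_p(rows, p):
    """Rank over F_p (p prime) of a matrix given as a list of rows."""
    mat = [[entry % p for entry in row] for row in rows]
    n_cols = len(mat[0]) if mat else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col]), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        inverse = pow(mat[rank][col], -1, p)  # modular inverse (Python 3.8+)
        mat[rank] = [(entry * inverse) % p for entry in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][col]:
                factor = mat[r][col]
                mat[r] = [(x - factor * y) % p for x, y in zip(mat[r], mat[rank])]
        rank += 1
    return rank


def row_is_independent(A, i, p):
    """True if row i of A is not a linear combination of the other rows."""
    others = A[:i] + A[i + 1:]
    return rank_mod_p(A, p) > rank_mod_p(others, p)


def unit_vector_in_column_space(A, i, p):
    """True if the unit vector e_i lies in the column space of A."""
    augmented = [row + [1 if r == i else 0] for r, row in enumerate(A)]
    return rank_mod_p(augmented, p) == rank_mod_p(A, p)
```

For the matrix with rows $(1,0,2)$, $(0,1,3)$, $(0,2,1)$ over $\mathbb{F}_5$, the first row is independent of the others and $e_1$ is indeed a column, whereas the second row is half of the third and the corresponding unit vector is not in the column space.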

Next, we prove a generalization of Lemma 5.11.

Lemma 7.4. Let $\kappa_1, \dots, \kappa_m$ be multilinear forms of degree $d$ and suppose that there is
some $r \le m$ such that $\kappa_r$ has rank at least $R$. Let $B$ be an $m \times n$ matrix with entries
$b_{ij} \in \mathbb{F}_p$ and suppose that the $r$th row of $B$ is not a linear combination of the other rows.
Then at least one of the multilinear forms $\eta_j = \sum_{i=1}^m b_{ij} \kappa_i$ has rank at least $R/(2m)^d$.

Proof. By Lemma 7.3 we can find coefficients $c_1, \dots, c_n$ such that $\sum_j c_j b_{ij} = 1$ if $i = r$ and
0 otherwise. Furthermore, since the column vectors all live in $\mathbb{F}_p^m$, we can do this in such
a way that at most $m$ of these coefficients are non-zero. But in that case,

\[ \sum_j c_j \eta_j = \sum_i \sum_j c_j b_{ij} \kappa_i = \kappa_r, \]

so we have written $\kappa_r$ as a linear combination of at most $m$ of the forms $\eta_j$. If $r'$ is the
maximum rank of any $\eta_j$, it follows from Corollary 5.10 that $\kappa_r$ has rank at most $(2m)^d r'$.
The result follows. □

Next, we need a generalization of Proposition 5.12.

Proposition 7.5. Let $k$ and $m$ be positive integers. For each $i \le m$ let $k_i$ be a positive
integer less than or equal to $k$, and let $\pi_i : \mathbb{F}_p^n \to \mathbb{F}_p$ be a polynomial of degree $k_i$. Suppose
also that there is some $r$ such that $\pi_r$ has rank at least $R$ and $k_r$ is at least as big as every
other $k_i$. Let $L_1, \dots, L_m$ be linear forms in $d$ variables and suppose that $L_r^{k_r}$ is not in the
linear span of the other functions $L_i^{k_r}$. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d}\, \omega^{\sum_{i=1}^m \pi_i(L_i(x))} \Bigr| \le p^{-R/(2^k (2m)^d)}. \]

Proof. We shall not give a complete proof. Instead, we shall just point out where the proof
differs from the proof of Proposition 5.12.

A very slight difference occurs at the end of the third paragraph, where instead of saying
"at least one $\kappa_{it}$ has rank at least $R$," it is now more appropriate to say that $\kappa_{rt}$ has rank
at least $R$. Note also that $t = k_r$.

The main difference, however, is that when it comes to proving the claim, we shall use
Lemma 7.4 instead of Lemma 5.11. We do not know that the functions $L_i^t$ are linearly
independent, but we do know that $L_r^t$ is independent of the other $L_i^t$. From this it follows
that if we define the matrix $(c'_{iV})$ just as before, the $r$th row will be independent of the
other rows, which is what we need in order to be able to apply Lemma 7.4. We can now
complete the proof by applying Lemma 5.6 just as before. □

It remains to discuss how the proof of Theorem 7.2 differs from the proof of Theorem
6.1. A superficial difference is that we are looking at $f_i(L_i(x))$ instead of $f(L_i(x))$. A
deeper difference is that when we split $f_i$ up into polynomial phases, the degrees of these
phases are between $s_i$ and $k$ rather than between $s$ and $k$.

Exactly as in that proof, we reduce the task to proving a result in the case that the $f_i$
are polynomial phases of high rank. Furthermore, our assumption that each $L_i^{s_i}$ is independent
of all the other $L_j^{s_i}$, which implies that $L_i^t$ is independent of all the other $L_j^t$ whenever
$t \ge s_i$, guarantees that the condition for Proposition 7.5 holds for each of these terms.
This completes the proof of Theorem 7.2, and hence of Theorem 7.1 as well.

It may be that a substantially stronger result than Theorem 7.2 is true: it could be
enough if there is just one $i$ such that $L_i^{s_i}$ is independent of the other $L_j^{s_i}$. The evidence for this is
that it is true in the case where all the $s_i$ are equal to some $s$ and the system of linear
forms has Cauchy-Schwarz complexity at most $s$. In that case all the polynomial phases in
our decompositions have degree $s$, and the condition is that some $L_i^s$ is independent of the
other $L_j^s$, which is enough for our argument to work because the polynomial phase used in
the decomposition of $f_i$ has maximal degree amongst all the polynomial phases.

The simplest situation where the difficulty arises is if the $L_i$ have Cauchy-Schwarz
complexity 3 and we know that $L_1^2$ is independent of the other $L_i^2$. We would like it to be
enough if $f_1$ had a small $U^2$ norm, but to prove that we would have to decompose $f_1$ into
quadratic and cubic phases, plus error terms, and we have trouble dealing with terms that
involve the quadratic part of $f_1$ and cubic parts of other $f_i$. Thus, the following problem
remains open.

Problem 7.6. Let $\epsilon > 0$ and let $(s_1, \dots, s_m)$ be a sequence of positive integers. Does there
exist a constant $c > 0$ with the following property? Let $L_1, \dots, L_m$ be linear forms such
that for some $i \le m$ it is impossible to write $L_i^{s_i}$ as a linear combination of the functions
$L_j^{s_i}$ with $j \ne i$, and let $f_1, \dots, f_m$ be functions from $\mathbb{F}_p^n$ to $\mathbb{C}$ such that $\|f_i\|_\infty \le 1$ and
$\|f_i\|_{U^{s_i}} \le c$ for every $i \le m$. Then

\[ \Bigl| \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) \Bigr| \le \epsilon. \]



A second piece of evidence in favour of a positive answer is that there is a fairly natural 
example that would show that, if true, such a result would be best possible. We briefly 
sketch the example. 

Example 7.7. Let $(s_1, \dots, s_m)$ be a sequence of positive integers. Let $L_1, \dots, L_m$ be linear
forms such that for each $i$ it is possible to write $L_i^{s_i}$ as a linear combination of the functions
$L_j^{s_i}$ with $j \ne i$. Let $p$ be sufficiently large. Then for every $c > 0$ there exist a positive integer
$n$ and functions $f_1, \dots, f_m$ such that $\|f_i\|_{U^{s_i}} \le c$ for every $i$ and

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) = 1. \]



Proof. Let $\pi_s$ be the polynomial $x \mapsto \sum_{t=1}^n x_t^s$ (defined on $\mathbb{F}_p^n$). It can be checked that for
fixed $s$ the rank of $\pi_s$ tends to infinity as $n$ tends to infinity, and therefore that the $U^s$ norm
of the function $\omega^{\pi_s}$ tends to zero.

Now let us choose coefficients $c_{ij} \in \mathbb{F}_p$ such that for each $i$ we have $c_{ii} \ne 0$ and
$\sum_j c_{ij} L_j^{s_i} = 0$. The dependence assumption of the theorem guarantees that we can do
this. Let $\mu_1, \dots, \mu_m$ be coefficients that we shall choose in a moment, and for each $i$ let
$f_i$ be the function $f_i(x) = \omega^{\sum_j \mu_j c_{ji} \pi_{s_j}(x)}$. Note that the exponent is a linear combination
of the polynomials $\pi_s$. We need $\|f_i\|_{U^{s_i}}$ to be small, which it will be if the coefficient of
$\pi_{s_i}$ is non-zero. We know that $c_{ii} \ne 0$, so it is enough if $\mu_i \ne 0$ and the sum of the $\mu_j c_{ji}$
over all $j \ne i$ such that $s_j = s_i$ does not equal $-\mu_i c_{ii}$. If we choose the $\mu_j$ randomly, then an
easy probabilistic argument shows that for large enough $p$ (depending on $m$ only) there is
a non-zero probability that we will never have any cancellation of this kind.






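We now claim that

\[ \mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x)) = 1. \]

To prove this, we first observe that if $\sum_j c_{ij} L_j^{s_i} = 0$, then $\sum_j c_{ij} (\pi_{s_i} \circ L_j) = 0$ as well.
(Note that in the first equation we are thinking of $L_j$ as a function from $\mathbb{F}_p^d$ to $\mathbb{F}_p$ and in
the second it is a function from $(\mathbb{F}_p^n)^d$ to $\mathbb{F}_p^n$.) To check this, one can expand out both sides.
Therefore, the exponent of $\prod_{i=1}^m f_i(L_i(x))$ is

\[ \sum_i \sum_j \mu_j c_{ji}\, \pi_{s_j}(L_i(x)) = \sum_j \mu_j \sum_i c_{ji}\, (\pi_{s_j} \circ L_i)(x), \]

which is zero, since the coefficients $c_{ji}$ have been chosen to make the inner sum zero for
every $j$. It follows that $\prod_{i=1}^m f_i(L_i(x)) = 1$ for every $x$, which proves the theorem. □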
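The key observation in this proof, that a linear relation among the powers $L_j^s$ lifts to the same relation among the functions $\pi_s \circ L_j$, can be tested on a concrete system. The sketch below uses a system of our own choosing, which does not appear in the paper: the forms $x$, $y$, $x + y$, $x - y$ in $d = 2$ variables with $s = 2$, for which $(x+y)^2 + (x-y)^2 - 2x^2 - 2y^2 = 0$. All names are illustrative.

```python
import random


def pi_s(v, s, p):
    """pi_s(v) = sum_t v_t^s in F_p, for a vector v in F_p^n."""
    return sum(pow(c, s, p) for c in v) % p


def apply_form(coeffs, xs, p):
    """L(x_1, ..., x_d) = sum_u coeffs[u] * x_u, computed coordinatewise in F_p^n."""
    n = len(xs[0])
    return tuple(sum(c * x[t] for c, x in zip(coeffs, xs)) % p for t in range(n))


# Illustrative system in d = 2 variables: x, y, x + y, x - y, with s = 2.
FORMS = [(1, 0), (0, 1), (1, 1), (1, -1)]
# (x + y)^2 + (x - y)^2 - 2x^2 - 2y^2 = 0, so these coefficients annihilate the squares.
DEPENDENCE = [-2, -2, 1, 1]


def lifted_relation(xs, p, s=2):
    """sum_j c_j * pi_s(L_j(x)) in F_p, which should vanish for every x."""
    total = sum(
        c * pi_s(apply_form(L, xs, p), s, p) for c, L in zip(DEPENDENCE, FORMS)
    )
    return total % p


def random_point(n, d, p, rng):
    """A random element of (F_p^n)^d, as a list of d tuples of length n."""
    return [tuple(rng.randrange(p) for _ in range(n)) for _ in range(d)]
```

For any odd prime $p$ and any point `xs` produced by `random_point`, the value `lifted_relation(xs, p)` is 0, which is the coordinatewise form of the polynomial identity above.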

Another problem that remains annoyingly open is to show that the dependence of c 
on the other parameters in Theorems 6.1 and 7.2 cannot be too good. This would be a
convincing argument that it was impossible to prove these theorems by some kind of clever 
transformation followed by multiple applications of the Cauchy-Schwarz inequality. We do 
not believe that such a proof exists, but it would be good to have more evidence for this. 

We end with the following simple case of this problem. 

Problem 7.8. Do there exist positive integers $s$ and $k$ and a degree-$s$ independent system
of linear forms $L_1, \dots, L_m$ with the following property? For every positive real number $r$
there exist $\epsilon > 0$ and functions $f_i : \mathbb{F}_p^n \to \mathbb{C}$ such that $\|f_i\|_{U^s} \le \epsilon^r$ for every $i$, and yet
$|\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_i(L_i(x))| \ge \epsilon$?